Product Company
We are building a new app with a mission to connect people in real life so they can enjoy time together. It helps users make meaningful new connections by hosting and joining real-life, offline experiences.
Responsibilities:
Your mission will be to deliver clean, reliable and timely data to internal stakeholders:
- Provide relevant data for machine learning engineers to explore, build models and feed signals to personalisation systems (feed ranking, push notifications, friend suggestions, etc.);
- Enable business users to understand how the product is performing and empower them to do their own data exploration by feeding the right data to BI reporting, product analytics and CRM tools.
Data stack:
- PostgreSQL;
- Cloud Firestore;
- Fivetran;
- Segment;
- BigQuery;
- dbt;
- Amplitude;
- Metabase;
- Dagster.
Most of our infrastructure runs on GCP. We prefer modern, managed, cloud-based solutions to maximise developer efficiency.
Tasks:
- Communicate with cross-functional stakeholders to reach alignment on goals;
- Work with mobile and backend engineering teams to make sure that the right customer data is collected;
- Ensure that all first- and third-party data lands in the data warehouse;
- Build data pipelines to transform the raw data into reusable models;
- Maintain data documentation and definitions;
- Implement data quality control with monitoring and alerting;
- Apply software engineering best practices to analytics code: version control, code reviews, testing, continuous integration;
- Implement regulatory compliance measures (GDPR, CCPA, etc.), such as anonymisation of PII;
- Propose and implement changes in data architecture in collaboration with the Personalisation team.
Requirements:
- Be proficient in SQL. Write clean and efficient code to extract reusable signals from raw data;
- Communicate clearly and effectively, including in writing. Be able to write clear proposals, seek alignment, and request and act on feedback;
- Be proactive, take ownership, be able to tolerate and resolve ambiguity;
- Have basic Python skills.
Nice to have:
- Experience working with dbt models to transform data in the warehouse;
- Comfort working with Git;
- Experience in building stream processing pipelines (e.g. GCP Dataflow, Spark Streaming, Flink);
- Experience with implementing workflow orchestration (e.g. Airflow or Prefect);
- Experience with implementing data quality control (e.g. Great Expectations);
- Experience in designing and building a data stack;
- Experience in backend infrastructure design and an understanding of container and orchestration technologies (e.g. Docker, Kubernetes);
- Experience working with data scientists and machine learning engineers, and an understanding of ML workflows.