Data Engineer Python (Remote)

Yuno AI · Milan, Province of Milan, Italy · €55,000 - €70,000


Job description

We're looking for a seriously strong Data Engineer to join our team (3 developers right now, growing fast). By "seriously strong" we mean you know ETL pipelines and data warehouses inside out. We deal with heterogeneous data (web scraping, PDFs, already-structured sources, LLM-based enrichments, transformers for embeddings).

Your role will be to build our data infrastructure and data engineering team from (almost) scratch. Our current stack (GCP + Python + Cloud Run + Postgres) holds up, but it's not enough. We're starting to handle large volumes of data from extremely heterogeneous sources, and your role will be absolutely key to the company's growth (you'll read this in any job ad, but with us it's actually true). Let's start, try it, and see. Maybe it'll make sense, maybe it won't, but trying wastes less time than sitting through endless meetings.

Compensation: €55–70k gross annual salary + stock options (open to freelance/VAT arrangements too).
English: C1 level. We're an international company and not all colleagues speak Italian.
Location: based in Milan. We're flexible on remote work, but you'll need to come into the office from time to time.

Yunoai is a startup redefining how banks, funds, and advisory firms analyze and compare companies. Our product lets you search, find, and compare companies semantically: not by keyword, but by meaning. Under the hood: proprietary algorithms, cutting-edge tech (we're among the first to integrate MCP), and a huge amount of data from heterogeneous sources that we have to collect, clean, and make useful.

Application and Return (at the bottom)