Job Description

Bring cutting-edge machine learning to production: from data pipelines and annotation strategy to model training, deployment, and monitoring. You’ll help us build reliable computer vision systems and next‑gen multimodal AI that understand images and text.

What you’ll do:

• Own the data lifecycle: design and maintain pipelines for data extraction, cleaning, and preprocessing at scale (batch & streaming).

• Partner on data annotation: define labeling guidelines, set QA criteria, run audits, and implement active learning loops to reduce labeling cost and speed up iteration.

• Build CV models: ship models for image processing, segmentation, and object detection; run experiments, ablations, and error analyses to continuously improve accuracy and latency.

• Develop analytical tools: create internal dashboards, notebooks, and utilities to visualize datasets, track experiments, and monitor model health in production.

• Engineer for reliability: refactor and improve code structure, enforce testing & typing, and contribute to architecture/design reviews.

• MLOps end‑to‑end: containerize training/inference, automate CI/CD, manage model registries, and set up observability (drift, performance, cost).

• Multimodal LLMs/VLMs: prototype and productionize vision‑language capabilities (e.g., visual Q&A, captioning, OCR understanding, retrieval‑augmented reasoning over images + text).

Requirements

Required qualifications:

• Solid understanding of the data science process and experimental rigor (hypothesis design, baselines, A/B testing, statistical significance).

• Proven Python engineering skills (clean code, testing, packaging); strong with NumPy/Pandas, OpenCV, and one major DL framework (PyTorch or TensorFlow).

• Hands‑on experience with image processing, segmentation, and object detection (classical + deep learning approaches).

• Cloud ML experience (AWS/GCP/Azure) and containerization/orchestration (Docker; familiarity with job schedulers or Kubernetes is a plus).

• Experience with MLOps practices (experiment tracking, model registries, reproducible training, monitoring) and active learning for dataset curation.

• Fluency in math & statistics (probability, optimization, linear algebra).

• Master’s degree in Computer Science or a related quantitative field (or equivalent practical experience).

Multimodal LLM / Vision‑Language experience (plus):

• Fine‑tuning or instruct‑tuning multimodal LLMs/VLMs; integrating vision encoders with language backbones.

• Building image+text datasets (captions, OCR, layout) and designing alignment/evaluation protocols for multimodal tasks.

• Prompt design, tool‑use/RAG over images and text, safety/guardrail strategies, and latency/cost optimization for inference.

Nice to have:

• Experience with labeling tools (e.g., Label Studio, CVAT) and data quality frameworks.

• Knowledge of modern CV model families (e.g., transformer‑based detection/segmentation) and techniques like distillation/quantization.

• Familiarity with API development (FastAPI/Flask) and basic frontend skills for internal dashboards.

• Background with synthetic data generation or simulation‑based augmentation.

Company offers

What we offer:

• Modern tools and an environment optimized for high‑impact ML work.

• Flexible working hours and a supportive, professional international team.

• Budget for training & conferences, plus time for learning and exploration.

Key benefits

Additional Health Compensation

Flexible Working Time

Sports Compensation