DevOps & Infrastructure Architect

7+ years of experience
Foetz, Luxembourg
Full-time, On-site

About the Role

We are currently looking for a DevOps & Infrastructure Architect for our client, an innovative company working on advanced UAV technologies and autonomous aerial systems.
As our DevOps & Infrastructure Architect, you will own the full stack beneath the engineering organisation – from physical rack layouts and server specifications through to Kubernetes clusters, CI/CD pipelines, MLOps platforms, and the security controls that tie it all together. This is a hands-on architecture role: you will design the blueprint and build, deploy, and operate the running infrastructure. You serve embedded engineers, ML researchers, flight-software developers, and data analysts – teams whose workloads look nothing like a typical web-app shop. If you want to be the person who makes a defence-grade autonomous-systems company actually ship, this is the role.

Responsibilities

  • IT systems architecture – Design the full infrastructure stack: service topology, network segmentation, trust zones, rack layouts, power/cooling, and cabling. Maintain architecture docs, data-flow diagrams, and capacity plans the entire engineering org depends on.

  • Hardware lifecycle management – Own assets from specification and procurement through deployment, maintenance, and decommissioning. Build hardware roadmaps tied to engineering growth forecasts so capacity is never the bottleneck.

  • Server infrastructure – Size and deploy on-prem and hybrid compute environments serving concurrent workloads: GPU clusters for ML training and inference, tiered storage for telemetry and sensor datasets, analytics and BI tooling, development platforms, and local LLM deployment.

  • Development platform administration – Deploy and operate self-hosted engineering tooling – primarily GitLab (SCM, CI/CD, registries, issue tracking) – plus artefact repositories, SAST scanners, and documentation platforms. Own high availability, backup/restore, and upgrade lifecycle for all dev tools.

  • MLOps infrastructure – Build and maintain the ML lifecycle stack: experiment tracking, model registries, GPU-scheduled training orchestration, dataset versioning, and model serving pipelines. Ensure full reproducibility and auditability from data ingestion to edge deployment.

  • Containerisation and orchestration – Design and operate the container platform: Docker build pipelines, private registries, and Kubernetes clusters (RBAC, network policies, resource quotas, persistent storage, ingress). Manage deployments via Helm/Kustomize with GitOps-driven delivery.

  • CI/CD pipeline architecture – Design scalable pipelines across heterogeneous codebases: embedded firmware with cross-compilation and HIL triggers, Python/C++ services, ML training jobs, and infrastructure-as-code. Implement secret management, signed artefacts, SBOM generation, runner fleet management, and parallelisation.

  • Agile tooling and workflow support – Configure and maintain tooling for Agile workflows: issue tracking, sprint boards, branch-per-ticket and merge-request workflows with automated status transitions, CI-enforced Definition of Done, and engineering metrics dashboards.

  • IT security architecture – Design and enforce security across the stack: network segmentation, firewall management, VPN/zero-trust access, IAM (LDAP/AD, SSO, MFA), PAM, endpoint security, vulnerability scanning, patch management, encrypted storage and transport, SIEM integration, and incident response. Support air-gapped environments where required.

  • Monitoring, observability, and reliability – Implement infrastructure monitoring, log management, and alerting. Define SLAs/SLOs and build tested DR and business-continuity plans with clear RTO/RPO targets and failover procedures.

  • Infrastructure as Code and automation – Manage all infrastructure through code: provisioning, config management, and automation scripting. Every infrastructure change goes through code review and CI validation before touching production.

Qualifications

  • BS/MS in Computer Science, IT, Systems Engineering, or equivalent practical experience

  • 7+ years in infrastructure, DevOps, or platform engineering; 3+ years in an architecture or tech-lead role

  • Deep Linux admin (RHEL/Ubuntu/Debian): systemd, kernel tuning, LVM, production troubleshooting

  • Physical and logical infrastructure design: compute/storage sizing, VLAN/subnet layout, firewall management, rack capacity planning

  • Kubernetes in production (k8s, k3s, or OpenShift): cluster architecture, RBAC, NetworkPolicy, PV provisioning (Ceph, Longhorn, NFS), Ingress (NGINX, Traefik), resource quotas

  • GitOps delivery with ArgoCD or Flux: Helm/Kustomize across dev/staging/prod, drift detection, and rollback

  • CI/CD pipeline design at scale with GitLab CI (or Jenkins/GitHub Actions): pipeline topology, runner fleet management, Kaniko/DinD builds, artefact signing, SBOM generation, secret management (Vault, CI variables)

  • Infrastructure-as-Code: Terraform or OpenTofu for provisioning, Ansible or Salt for config management, Bash/Python for automation

  • Security architecture: LDAP/AD with SSO (SAML/OIDC, Keycloak), MFA, VPN/zero-trust (WireGuard, Tailscale, Cloudflare Access), TLS management, LUKS, vulnerability scanning (Trivy, Grype, OpenVAS), patch workflows

  • Observability stack experience: metrics, log aggregation, alerting, and infrastructure monitoring

  • Storage architecture: NAS/SAN (TrueNAS, NetApp), object storage (MinIO/S3-compatible), tiered hot/warm/cold strategies, backup/recovery (Restic, Borg, Velero) for multi-terabyte datasets

  • Container expertise: multi-stage Docker builds, layer caching, private registry operation (Harbor, GitLab Registry), image scanning, runtime security

  • Ability to translate engineering needs into infrastructure designs, trade-off analyses, and capacity roadmaps

  • Clear technical documentation: architecture diagrams (draw.io, Mermaid), runbooks, operational procedures, post-incident reviews

  • Hardware procurement and vendor management experience (servers, storage, networking, UPS)

  • Comfortable in security-sensitive or defence-adjacent environments (access controls, audits, need-to-know policies)

  • Collaborative mindset – infrastructure as a service to engineering teams

  • English: Upper Intermediate or higher.

Nice to Have

  • MLOps infrastructure: MLflow or Weights & Biases for experiment tracking, model registries, training orchestration (Kubeflow, Ray, SLURM), dataset versioning (DVC, lakeFS), model serving (Triton, TorchServe), GPU operator and MIG partitioning

  • Embedded and real-time systems CI: cross-compilation toolchains, hardware-in-the-loop (HIL) test integration, firmware signing and OTA update pipelines

  • Edge deployment infrastructure: OTA update systems, lightweight container runtimes, remote management and telemetry collection from deployed UAVs

  • Data engineering support: pipeline orchestration (Airflow, Prefect), data lake architecture, time-series and telemetry storage (InfluxDB, TimescaleDB), BI and analytics tooling (Grafana, Metabase)

  • Regulated or certified environments: ISO 27001, SOC 2, DO-178C awareness; experience with audit trails, change management, and evidence collection

  • Experience at a defence, aerospace, or deep-tech hardware company

  • Familiarity with Luxembourg's regulatory and data-sovereignty landscape

  • Additional European language (French, German, or Luxembourgish)

What We Offer

  • Office-based work in Luxembourg (5 days per week).

  • Relocation assistance.

  • 26 days of paid vacation.

  • Medical insurance and sick leave covered by the Luxembourg national healthcare system.

  • Clear work-life balance policy with no overtime culture.
