DevOps & Infrastructure Architect
About the Role
We are currently looking for a DevOps & Infrastructure Architect for our client, an innovative company working on advanced UAV technologies and autonomous aerial systems.
As our DevOps & Infrastructure Architect, you will own the full stack beneath the engineering organisation – from physical rack layouts and server specifications through to Kubernetes clusters, CI/CD pipelines, MLOps platforms, and the security controls that tie it all together. This is a hands-on architecture role: you will design the blueprint and build, deploy, and operate the running infrastructure. You serve embedded engineers, ML researchers, flight-software developers, and data analysts – teams whose workloads look nothing like a typical web-app shop. If you want to be the person who makes a defence-grade autonomous-systems company actually ship, this is the role.
Responsibilities
- IT systems architecture – Design the full infrastructure stack: service topology, network segmentation, trust zones, rack layouts, power/cooling, and cabling. Maintain the architecture docs, data-flow diagrams, and capacity plans the entire engineering org depends on.
- Hardware lifecycle management – Own assets from specification and procurement through deployment, maintenance, and decommissioning. Build hardware roadmaps tied to engineering growth forecasts so capacity is never the bottleneck.
- Server infrastructure – Size and deploy on-prem and hybrid compute environments serving concurrent workloads: GPU clusters for ML training and inference, tiered storage for telemetry and sensor datasets, analytics and BI tooling, development platforms, and local LLM deployment.
- Development platform administration – Deploy and operate self-hosted engineering tooling – primarily GitLab (SCM, CI/CD, registries, issue tracking) – plus artefact repositories, SAST scanners, and documentation platforms. Own high availability, backup/restore, and upgrade lifecycle for all dev tools.
- MLOps infrastructure – Build and maintain the ML lifecycle stack: experiment tracking, model registries, GPU-scheduled training orchestration, dataset versioning, and model serving pipelines. Ensure full reproducibility and auditability from data ingestion to edge deployment.
- Containerisation and orchestration – Design and operate the container platform: Docker build pipelines, private registries, and Kubernetes clusters (RBAC, network policies, resource quotas, persistent storage, ingress). Manage deployments via Helm/Kustomize with GitOps-driven delivery.
- CI/CD pipeline architecture – Design scalable pipelines across heterogeneous codebases: embedded firmware with cross-compilation and HIL triggers, Python/C++ services, ML training jobs, and infrastructure-as-code. Implement secret management, signed artefacts, SBOM generation, runner fleet management, and parallelisation.
- Agile tooling and workflow support – Configure and maintain tooling for Agile workflows: issue tracking, sprint boards, branch-per-ticket and merge-request workflows with automated status transitions, CI-enforced Definition of Done, and engineering metrics dashboards.
- IT security architecture – Design and enforce security across the stack: network segmentation, firewall management, VPN/zero-trust access, IAM (LDAP/AD, SSO, MFA), PAM, endpoint security, vulnerability scanning, patch management, encrypted storage and transport, SIEM integration, and incident response. Support air-gapped environments where required.
- Monitoring, observability, and reliability – Implement infrastructure monitoring, log management, and alerting. Define SLAs/SLOs, and build tested DR and business-continuity plans with clear RTO/RPO targets and failover procedures.
- Infrastructure as Code and automation – Manage all infrastructure through code: provisioning, config management, and automation scripting. Every infrastructure change goes through code review and CI validation before touching production.
Qualifications
- BS/MS in Computer Science, IT, Systems Engineering, or equivalent practical experience
- 7+ years in infrastructure, DevOps, or platform engineering, including 3+ years in an architecture or tech-lead role
- Deep Linux administration (RHEL/Ubuntu/Debian): systemd, kernel tuning, LVM, production troubleshooting
- Physical and logical infrastructure design: compute/storage sizing, VLAN/subnet layout, firewall management, rack capacity planning
- Kubernetes in production (k8s, k3s, or OpenShift): cluster architecture, RBAC, NetworkPolicy, PV provisioning (Ceph, Longhorn, NFS), Ingress (NGINX, Traefik), resource quotas
- GitOps delivery with Argo CD or Flux: Helm/Kustomize across dev/staging/prod, drift detection, and rollback
- CI/CD pipeline design at scale with GitLab CI (or Jenkins/GitHub Actions): pipeline topology, runner fleet management, Kaniko/DinD builds, artefact signing, SBOM generation, secret management (Vault, CI variables)
- Infrastructure-as-Code: Terraform or OpenTofu for provisioning, Ansible or Salt for config management, Bash/Python for automation
- Security architecture: LDAP/AD with SSO (SAML/OIDC, Keycloak), MFA, VPN/zero-trust (WireGuard, Tailscale, Cloudflare Access), TLS management, LUKS, vulnerability scanning (Trivy, Grype, OpenVAS), patch workflows
- Observability stack experience: metrics, log aggregation, alerting, and infrastructure monitoring
- Storage architecture: NAS/SAN (TrueNAS, NetApp), object storage (MinIO/S3-compatible), tiered hot/warm/cold strategies, backup/recovery (Restic, Borg, Velero) for multi-terabyte datasets
- Container expertise: multi-stage Docker builds, layer caching, private registry operation (Harbor, GitLab Registry), image scanning, runtime security
- Ability to translate engineering needs into infrastructure designs, trade-off analyses, and capacity roadmaps
- Clear technical documentation: architecture diagrams (draw.io, Mermaid), runbooks, operational procedures, post-incident reviews
- Hardware procurement and vendor management experience (servers, storage, networking, UPS)
- Comfortable in security-sensitive or defence-adjacent environments (access controls, audits, need-to-know policies)
- Collaborative mindset – infrastructure as a service to engineering teams
- English: Upper Intermediate or higher
Nice to Have
- MLOps infrastructure: MLflow/Weights & Biases for experiment tracking, model registries, training orchestration (Kubeflow, Ray, Slurm), dataset versioning (DVC, lakeFS), model serving (Triton, TorchServe), GPU Operator and MIG partitioning
- Embedded and real-time systems CI: cross-compilation toolchains, hardware-in-the-loop (HIL) test integration, firmware signing and OTA update pipelines
- Edge deployment infrastructure: OTA update systems, lightweight container runtimes, remote management and telemetry collection from deployed UAVs
- Data engineering support: pipeline orchestration (Airflow, Prefect), data lake architecture, time-series and telemetry storage (InfluxDB, TimescaleDB), BI and analytics tooling (Grafana, Metabase)
- Regulated or certified environments: ISO 27001, SOC 2, DO-178C awareness; experience with audit trails, change management, and evidence collection
- Experience at a defence, aerospace, or deep-tech hardware company
- Familiarity with Luxembourg's regulatory and data-sovereignty landscape
- Additional European language (French, German, or Luxembourgish)
What We Offer
- Office-based work in Luxembourg (5 days per week).
- Relocation assistance.
- 26 days of paid vacation.
- Medical insurance and sick leave covered by the Luxembourg national healthcare system.
- Clear work-life balance policy with no overtime culture.