Senior AI Solution Architect
Are you a Solution Architect who's designed systems with LLM components in production, not just added an "AI" box to a diagram? We're expanding our AI engineering practice for enterprise clients in healthcare, media, and other regulated industries. We need an architect who can design the whole system: how the agent fits into the client's infrastructure, what breaks, what it costs, and why one approach is better than another for that specific client.
We're a ~300-person digital agency serving globally recognized enterprise clients. AI is becoming a component of every system we design. We need the person who makes the architectural decisions.
This is an open call for interest rather than an active vacancy. We are proactively building our AI talent pipeline and would love to review your background for future roles. If your profile aligns with our vision, we will be in touch when the right project kicks off.
If this sounds like you, we'd love to hear from you.
What is it all about?
Design end-to-end architectures for agent systems and LLM-powered applications: from user input to production deployment, including all integrations, data flows, and compliance layers.
Make and defend technology choices: which LLM provider, which vector database, which retrieval strategy, which agent pattern, which cloud services — and why, for this specific client and use case.
Design RAG architectures: chunking strategies, embedding model selection, vector database design (pgvector, Pinecone, Weaviate), hybrid search, reranking — choosing the right approach based on corpus size, update frequency, latency requirements, and regulatory constraints.
Design agent architectures: single-agent, multi-agent orchestrator, pipeline, human-in-the-loop, hybrid agent + deterministic workflows. Know when each is appropriate and when an agent is the wrong answer entirely.
Design integration architectures: how agent systems connect securely with client enterprise infrastructure — ERP, CRM, ITSM, document management, identity systems. OAuth, API gateways, mTLS, data isolation.
Design security and compliance architectures: PII handling, prompt injection prevention, audit logging, data residency, EU AI Act risk classification, GDPR, SDAIA requirements.
Design observability and evaluation architectures: what gets traced, how quality is measured, how cost is attributed, how drift is detected.
Estimate effort and infrastructure costs for AI engagements. Understand what's different about estimating AI projects vs classical software (higher variance, non-deterministic behaviour, eval overhead).
Lead technical conversations with client CTOs and Heads of Engineering. Present architectures, defend decisions, answer hard questions under pressure.
Write the technical sections of proposals: approach, architecture, timeline, team composition, risks, and infrastructure requirements.
Collaborate with AI Engineers on implementation feasibility, with AI Ops Engineers on operational design, and with Use-Case Consultants on technical qualification of new opportunities.
Build and maintain Q's standard architectural assets: reference architectures, decision frameworks, estimation templates, and security checklists.
Design hybrid deployment architectures across hyperscaler AI services (Bedrock, Azure OpenAI, Vertex), client infrastructure, and Q's private H200 GPU cluster. Decide where each workload runs based on cost, latency, data sensitivity, and sovereignty.
Design LLM cost architectures: prompt caching, model tiering and routing, batch vs real-time, token budgets. Defend TCO to client finance stakeholders.
Design prompt and context architectures: system prompt composition, structured output contracts, prompt versioning as deployable artifacts, context budgeting across multi-step workflows.
Design evaluation architectures for non-deterministic systems: golden datasets, LLM-as-judge, regression testing, drift detection — at CI and in production.
Select and defend models and frameworks: closed vs open-source LLMs, agent frameworks (LangGraph, Anthropic agent patterns, OpenAI Agents SDK), orchestration (n8n vs code-native), inference servers (vLLM, TGI).
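To give a flavour of the tiering and routing decisions described above, here is a minimal sketch of a model router that picks the cheapest tier meeting a capability bar and degrades gracefully when a token budget is exhausted. The tier names, cost figures, and capability scores are all hypothetical assumptions for illustration, not real pricing or a prescribed design.

```python
from dataclasses import dataclass

@dataclass
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # USD per 1k input tokens (illustrative figures)
    capability: int            # rough 1-10 capability ranking (assumed)

# Hypothetical tiers: a small self-hosted model, a mid-tier hosted
# model, and a frontier model.
TIERS = [
    ModelTier("small-local", 0.0002, 3),
    ModelTier("mid-hosted", 0.003, 6),
    ModelTier("frontier", 0.015, 9),
]

def route(required_capability: int, budget_exhausted: bool = False) -> ModelTier:
    """Pick the cheapest tier that meets the capability bar.

    When the monthly budget is exhausted, degrade to the most capable
    cheaper tier rather than failing the request outright.
    """
    eligible = [t for t in TIERS if t.capability >= required_capability]
    if not eligible:
        raise ValueError("no tier meets the capability requirement")
    choice = min(eligible, key=lambda t: t.cost_per_1k_tokens)
    if budget_exhausted:
        cheaper = [t for t in TIERS
                   if t.cost_per_1k_tokens < choice.cost_per_1k_tokens]
        if cheaper:
            choice = max(cheaper, key=lambda t: t.capability)
    return choice

# A simple extraction task routes to the cheapest adequate tier;
# a complex reasoning task routes to the frontier tier.
print(route(3).name)                          # small-local
print(route(8).name)                          # frontier
print(route(8, budget_exhausted=True).name)   # mid-hosted
```

The real architectural work is in defining the capability bar per use case and the degradation policy per client; the routing mechanism itself is the easy part.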
What do we expect?
7+ years in software engineering, with 3+ in a Solution Architect or equivalent technical leadership role.
Experience designing distributed systems, API architectures, and enterprise integrations at scale.
Hands-on experience designing at least one LLM-powered system that went to production — not just advising on one.
Understanding of LLM internals at an architectural level: tokenisation, context window economics, provider cost/latency/capability differences (Anthropic, OpenAI, Google, open-source via vLLM), and their implications for system design.
Understanding of RAG architecture: retrieval strategies, embedding models, vector database trade-offs, chunking approaches. You don't need to implement — you need to make the right design decision and explain why.
Understanding of agent architectural patterns: when to use agents vs deterministic workflows, state management, checkpointing, tool use design.
Experience with at least one cloud platform at an architectural level (AWS, Azure, or GCP), including AI/ML services (Bedrock, Azure OpenAI, Vertex AI).
Ability to design model routing strategies: local vs hosted, small fine-tuned vs frontier, provider failover.
Understanding of LLM cost optimisation as an architectural concern: caching, routing, batching, context budgeting.
Familiarity with prompt registries and eval harnesses as architectural concerns.
Experience with agent frameworks at a design level (LangGraph, Anthropic agent patterns, OpenAI Agents SDK), enough to choose between them.
Experience in estimating and scoping complex technical projects. Bonus if you've estimated AI-specific work and seen where estimates broke.
Ability to read and review code in Python and at least one other language. You don't write production code — but you must be able to evaluate it.
Experience leading technical pre-sales: presenting to client technical leadership, writing technical proposals, and defending architectural decisions.
Understanding of security architecture for AI systems: OWASP LLM Top 10, data isolation, PII handling, audit trails.
Familiarity with self-hosted LLM deployment (vLLM, TGI), inference optimisation, and GPU memory/batching trade-offs at an architectural level is a strong plus.
Familiarity with EU AI Act, GDPR, and regulatory compliance as architectural constraints is a strong plus.
Experience in healthcare, media, energy, or government/public sector verticals is a strong plus.
Familiarity with LLM observability architecture (Langfuse, LangSmith, OpenTelemetry) is a plus.
Familiarity with MCP (Model Context Protocol) and agent interoperability standards is a plus.
Cloud architecture certification (AWS SA Professional, Azure AZ-305, GCP Cloud Architect) is a plus.
Cloud AI certification (Azure AI-102, AWS AI Practitioner, GCP ML Engineer) is a plus.
Excellent communication skills — you can explain an architectural decision to a CTO and a junior developer with equal clarity.
Advanced English language skills.
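As a flavour of the cost reasoning expected above, here is a back-of-envelope prompt-caching calculation. All prices, token counts, and hit rates are made-up assumptions for illustration, not any provider's actual rates.

```python
def monthly_prompt_cost(
    requests_per_month: int,
    prompt_tokens: int,
    price_per_1k: float,         # USD per 1k uncached input tokens (assumed)
    cached_price_per_1k: float,  # USD per 1k cached input tokens (assumed)
    cache_hit_rate: float,
) -> float:
    """Blend cached and uncached pricing for a shared system prompt."""
    hits = requests_per_month * cache_hit_rate
    misses = requests_per_month - hits
    cost = (misses * prompt_tokens / 1000) * price_per_1k
    cost += (hits * prompt_tokens / 1000) * cached_price_per_1k
    return round(cost, 2)

# 1M requests/month with a 4k-token system prompt; cached tokens priced
# at 10% of uncached, 90% cache hit rate (all figures hypothetical):
without_cache = monthly_prompt_cost(1_000_000, 4_000, 0.003, 0.003, 0.0)
with_cache = monthly_prompt_cost(1_000_000, 4_000, 0.003, 0.0003, 0.9)
print(without_cache)  # 12000.0
print(with_cache)     # 2280.0
```

The point of the exercise is the defence, not the arithmetic: an architect should be able to show a client finance stakeholder exactly which assumptions drive the TCO and how sensitive the total is to each one.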
What we offer:
The location choice is yours: remote, on-site, or hybrid
Flexible working hours
Work with new technologies in a high-performance environment
Access to enterprise-grade infrastructure for architectural prototyping, including Q's on-premise NVIDIA H200 GPU cluster
Diverse international projects
IT community involvement — Meetups, Workshops & Articles
Internal workshops & personal development
100% paid sick leave
Paid health insurance
Subsidised Multisport card
Transport allowance & meal allowance
Salary range:
Our salaries are based on your experience, level of knowledge, and the technical interview.
We will only contact candidates who are the best fit for future opportunities.
Sound exciting? Click the button below and apply now :)
- Department: AI
- Locations: Zagreb
- Remote status: Hybrid
About Q Ltd.
We are an award-winning software and design agency.
With 350+ full-time experts, we provide services and solutions covering all phases of the digital product life-cycle. Working in various industries across 20 countries worldwide, we push our boundaries every day.