AI Inference Junior Engineer (WFH) – Full-Time Remote Job 2026 | Qubrid AI Hiring

RABI KR. PANDIT, Thursday, June 18, 2026

Looking for an advanced remote AI engineering opportunity? Qubrid AI is actively hiring an AI Inference Junior Engineer for its growing AI infrastructure team. This is a full-time work-from-home role designed for candidates with at least 2 years of experience in AI/ML infrastructure, LLM deployment, GPU optimization, and cloud-native systems.

This role offers the opportunity to work on cutting-edge AI infrastructure, optimize large-scale inference systems, and contribute to next-generation AI deployment platforms.

Job Overview

Job Role	AI Inference Junior Engineer
Company	Qubrid AI
Job Type	Full-Time
Work Mode	Remote / Work From Home
Experience Required	Minimum 2 Years
Salary	Competitive (Based on Experience)
Shift Timing	Late night (until 4 AM IST, to overlap with US working hours)
Openings	1 Position

About Qubrid AI

Qubrid AI is building the next generation of AI infrastructure platforms that enable organizations to deploy, scale, and monetize AI workloads across cloud, on-premises, and hybrid environments. Their unified AI stack combines GPU cloud infrastructure, inference APIs, model deployment, RAG pipelines, fine-tuning, and orchestration tools.

Role Overview

As an AI Inference Junior Engineer, you will work at the intersection of machine learning, distributed systems, GPU optimization, and cloud infrastructure. You will deploy, optimize, and scale AI models to support enterprise-grade AI workloads with low latency and high throughput.

Key Responsibilities

Deploy and manage LLMs, multimodal, vision, speech, and embedding models.
Build optimized inference pipelines for enterprise AI applications.
Implement scalable model serving architectures.
Optimize GPU memory allocation, utilization, and throughput.
Quantize and convert models using reduced-precision formats (FP16, BF16, INT8), quantization methods (GPTQ, AWQ), and the GGUF file format.
Work with NVIDIA H100, H200, A100, L40S, and advanced GPU platforms.
Build scalable inference clusters using Kubernetes.
Implement auto-scaling and fault-tolerant AI infrastructure.
Optimize multi-tenant AI serving environments.
Benchmark models for performance, cost, and accuracy.
Develop APIs and backend systems for AI services.
Collaborate with platform engineering teams for reliability improvements.

Required Qualifications

Bachelor’s or Master’s degree in Computer Science, AI/ML, Engineering, or related fields.
Minimum 2 years of software engineering experience.
Minimum 2 years of AI/ML production infrastructure experience.
Strong Python programming skills.
Deep understanding of Transformer models and LLM architectures.
Experience deploying Llama, DeepSeek, Qwen, Mistral, Gemma, and similar models.
Strong Linux system administration knowledge.
Hands-on experience with Docker and Kubernetes.
Knowledge of distributed systems and cloud-native architecture.

Technical Skills Required

AI & Machine Learning

PyTorch
Hugging Face Transformers
Model Quantization
Fine-tuning Workflows
Embedding Models
RAG Architectures
Vector Databases

Inference Frameworks

vLLM
NVIDIA TensorRT-LLM
Triton Inference Server
SGLang
TGI (Text Generation Inference)
Ollama
Ray Serve
NVIDIA Dynamo

GPU & Infrastructure

NVIDIA CUDA
TensorRT
NCCL
NVLink
NVSwitch
Multi-GPU Optimization
GPU Profiling

Cloud & DevOps

Kubernetes
Docker
Terraform
CI/CD Pipelines
AWS, Azure, GCP

Preferred Qualifications

Experience building AI APIs similar to those offered by OpenAI, Anthropic, Together AI, or DeepInfra.
Experience managing large-scale inference clusters.
Knowledge of GPU virtualization and multi-tenancy.
Distributed training and fine-tuning experience.
Familiarity with NVIDIA DGX and HGX systems.
Contributions to open-source AI infrastructure projects.

Why Join Qubrid AI?

Work on cutting-edge AI infrastructure.
Gain hands-on experience with advanced NVIDIA GPU platforms.
Build scalable AI systems for global enterprise customers.
Solve complex AI inference and optimization challenges.
Contribute to the future of AI infrastructure globally.

Important Note Before Applying

This is a full-time role. Candidates planning to manage multiple jobs simultaneously may be auto-rejected.

This role requires working late-night shifts in India (up to 4 AM IST) to overlap with US working hours.

Who Should Apply?

AI/ML engineers with infrastructure experience.
Candidates experienced in LLM deployment and optimization.
Professionals skilled in GPU performance tuning.
Developers interested in large-scale AI systems.
Cloud-native AI infrastructure engineers.

Final Thoughts

The AI Inference Junior Engineer opening at Qubrid AI is a strong opportunity for experienced AI engineers who want to work on advanced inference systems, GPU optimization, and enterprise AI deployment. If you have the right technical background and can work US-aligned hours, this role could significantly accelerate your AI infrastructure career.

AUTHOR: RABI KR. PANDIT

Hello, I am Rabi Kr. Pandit, the founder of this platform and a resident of Kolkata, India. I hold a degree in History, Political Science, and English, and I am currently pursuing my legal studies at Jogesh Chandra Chaudhary Law College under the University of Calcutta. I manage multiple educational blogs with the aim of delivering knowledge in a clear, structured, and accessible manner. My focus is on simplifying complex topics and presenting reliable information that benefits students and readers from diverse academic backgrounds.

BARRISTERY - Jobs, Internships & Career
