AI Inference Junior Engineer (WFH) – Full-Time Remote Job 2026 | Qubrid AI Hiring

Looking for an advanced remote AI engineering opportunity? Qubrid AI is actively hiring an AI Inference Junior Engineer for its growing AI infrastructure team. This is a full-time work-from-home role designed for candidates with at least 2 years of experience in AI/ML infrastructure, LLM deployment, GPU optimization, and cloud-native systems.

This role offers the opportunity to work on cutting-edge AI infrastructure, optimize large-scale inference systems, and contribute to next-generation AI deployment platforms.

Job Overview

Job Role: AI Inference Junior Engineer
Company: Qubrid AI
Job Type: Full-Time
Work Mode: Remote / Work From Home
Experience Required: Minimum 2 Years
Salary: Competitive (Based on Experience)
Shift Timing: Late Night (Till 4 AM IST for USA overlap)
Openings: 1 Position

About Qubrid AI

Qubrid AI is building the next generation of AI infrastructure platforms that enable organizations to deploy, scale, and monetize AI workloads across cloud, on-premises, and hybrid environments. Its unified AI stack combines GPU cloud infrastructure, inference APIs, model deployment, RAG pipelines, fine-tuning, and orchestration tools.

Role Overview

As an AI Inference Junior Engineer, you will work at the intersection of machine learning, distributed systems, GPU optimization, and cloud infrastructure. You will deploy, optimize, and scale AI models to support enterprise-grade AI workloads with low latency and high throughput.

Key Responsibilities

  • Deploy and manage LLMs, multimodal, vision, speech, and embedding models.
  • Build optimized inference pipelines for enterprise AI applications.
  • Implement scalable model serving architectures.
  • Optimize GPU memory allocation, utilization, and throughput.
  • Perform model quantization and reduced-precision inference using FP16, BF16, and INT8 data types, GPTQ and AWQ methods, and the GGUF format.
  • Work with NVIDIA H100, H200, A100, L40S, and advanced GPU platforms.
  • Build scalable inference clusters using Kubernetes.
  • Implement auto-scaling and fault-tolerant AI infrastructure.
  • Optimize multi-tenant AI serving environments.
  • Benchmark models for performance, cost, and accuracy.
  • Develop APIs and backend systems for AI services.
  • Collaborate with platform engineering teams for reliability improvements.

Required Qualifications

  • Bachelor’s or Master’s degree in Computer Science, AI/ML, Engineering, or related fields.
  • Minimum 2 years of software engineering experience.
  • Minimum 2 years of AI/ML production infrastructure experience.
  • Strong Python programming skills.
  • Deep understanding of Transformer models and LLM architectures.
  • Experience deploying Llama, DeepSeek, Qwen, Mistral, Gemma, and similar models.
  • Strong Linux system administration knowledge.
  • Hands-on experience with Docker and Kubernetes.
  • Knowledge of distributed systems and cloud-native architecture.

Technical Skills Required

AI & Machine Learning

  • PyTorch
  • Hugging Face Transformers
  • Model Quantization
  • Fine-tuning Workflows
  • Embedding Models
  • RAG Architectures
  • Vector Databases

Inference Frameworks

  • vLLM
  • NVIDIA TensorRT-LLM
  • Triton Inference Server
  • SGLang
  • TGI (Text Generation Inference)
  • Ollama
  • Ray Serve
  • NVIDIA Dynamo

GPU & Infrastructure

  • NVIDIA CUDA
  • TensorRT
  • NCCL
  • NVLink
  • NVSwitch
  • Multi-GPU Optimization
  • GPU Profiling

Cloud & DevOps

  • Kubernetes
  • Docker
  • Terraform
  • CI/CD Pipelines
  • AWS, Azure, GCP

Preferred Qualifications

  • Experience building AI APIs similar to those offered by OpenAI, Anthropic, Together AI, or DeepInfra.
  • Experience managing large-scale inference clusters.
  • Knowledge of GPU virtualization and multi-tenancy.
  • Distributed training and fine-tuning experience.
  • Familiarity with NVIDIA DGX and HGX systems.
  • Contributions to open-source AI infrastructure projects.

Why Join Qubrid AI?

  • Work on cutting-edge AI infrastructure.
  • Gain hands-on experience with advanced NVIDIA GPU platforms.
  • Build scalable AI systems for global enterprise customers.
  • Solve complex AI inference and optimization challenges.
  • Contribute to the future of AI infrastructure globally.

Important Note Before Applying

This is a full-time role. Candidates planning to manage multiple jobs simultaneously may be auto-rejected.

This role requires working late-night shifts in India (up to 4 AM IST) to overlap with US working hours.

Who Should Apply?

  • AI/ML engineers with infrastructure experience.
  • Candidates experienced in LLM deployment and optimization.
  • Professionals skilled in GPU performance tuning.
  • Developers interested in large-scale AI systems.
  • Cloud-native AI infrastructure engineers.

Final Thoughts

The AI Inference Junior Engineer opening at Qubrid AI for 2026 is a strong opportunity for experienced AI engineers who want to work on advanced inference systems, GPU optimization, and enterprise AI deployment. If you have the right technical background and can work US-aligned hours, this role can significantly accelerate your AI infrastructure career.
