EigenLayer·2 weeks ago
Explore jobs tagged ai-inference to find AI inference engineer, ML engineer (inference), and MLOps positions focused on model optimization, quantization, ONNX and TensorRT pipelines, edge and real-time deployment, and low-latency production systems. Use the filtering UI to narrow results by location, experience level, remote/onsite, industry, and tech stack (PyTorch, TensorFlow, ONNX Runtime, TensorRT, TFLite), then save searches or set alerts to be notified of new openings. Each listing highlights required skills and common interview topics, including profiling and benchmarking, latency SLOs, batching strategies, pruning, and mixed-precision/INT8 quantization (see the sketches below), and describes deployment patterns for edge devices (NVIDIA Jetson, Coral, ARM), cloud GPUs, and serverless inference. Review the linked best practices for testing, CI/CD, and monitoring inference pipelines, and apply to roles that ship scalable, low-latency ML inference in production.
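To ground the profiling and latency-SLO topics named above, here is a minimal benchmarking sketch using ONNX Runtime. The model path, input shape, and iteration counts are illustrative assumptions, not taken from any listing.

```python
# Minimal inference latency benchmark sketch (assumptions: a local
# "model.onnx" file and a 1x3x224x224 float32 input; adjust to your model).
import time
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
x = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Warm up so one-time initialization cost doesn't skew the measurements.
for _ in range(10):
    session.run(None, {input_name: x})

latencies = []
for _ in range(100):
    start = time.perf_counter()
    session.run(None, {input_name: x})
    latencies.append((time.perf_counter() - start) * 1000.0)  # milliseconds

# Latency SLOs are usually stated as percentiles, not means,
# because tail latency dominates user-facing behavior.
for p in (50, 95, 99):
    print(f"p{p}: {np.percentile(latencies, p):.2f} ms")
```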
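And a sketch of post-training dynamic INT8 quantization in PyTorch, another interview topic listed above; the toy model and the choice of quantizing only Linear layers are illustrative assumptions.

```python
# Dynamic INT8 quantization sketch; the toy model is illustrative.
# torch.ao.quantization.quantize_dynamic returns a copy of the model in
# which the listed module types use INT8 weights with dynamically
# computed activation scales.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
).eval()

quantized = torch.ao.quantization.quantize_dynamic(
    model,            # model to quantize (converted copy is returned)
    {nn.Linear},      # module types to replace with quantized versions
    dtype=torch.qint8,
)

x = torch.randn(1, 512)
with torch.inference_mode():
    print(quantized(x).shape)  # torch.Size([1, 10])
```

Dynamic quantization is the lightest-weight entry point since it needs no calibration data; static INT8 calibration or TensorRT builder flags are the usual next steps when activation quantization matters.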