2026
-
SAGESERVE: Optimizing LLM Serving on Cloud Data Centers with Forecast Aware Auto-Scaling
-
Agentix: An Efficient Serving Engine for LLM Agents as General Programs
-
Queue Management for SLO-Oriented Large Language Model Serving
-
HIKU: Pull-Based Scheduling for Serverless Computing
-
Hermes: Enhancing Layer-7 Cloud Load Balancers with Userspace-Directed I/O Event Notification
-
Tiny Autoscalers for Tiny Workloads: Dynamic CPU Allocation for Serverless Functions
-
Llumnix: Dynamic Scheduling for Large Language Model Serving
-
Efficient LLM Scheduling by Learning to Rank
2025
-
LLM-Based Misconfiguration Detection for AWS Serverless Computing
-
ML-FaaS: Toward Exploiting the Serverless Paradigm to Facilitate Machine Learning Functions as a Service
-
Critical Limitations of the Least Outstanding Request Load Balancing policy in Service Meshes for Large-Scale Microservice Applications
-
Heet: Accelerating Elastic Training in Heterogeneous Deep Learning Clusters
-
Hourglass: Enabling Efficient Split Federated Learning with Data Parallelism
-
Following the Data, Not the Function: Rethinking Function Orchestration in Serverless Computing
-
Optimizing Serverless Performance Through Game Theory and Efficient Resource Scheduling
-
Efficient GPU Resource Utilization in the Agentic AI Era
-
Layered mixed-precision training: A new training method for large-scale AI models
-
DMRO: A Deep Meta Reinforcement Learning-Based Task Offloading Framework for Edge-Cloud Computing
-
Topology-Aware Scheduling Framework for Microservice Applications in Cloud
-
SHOWAR: Right-Sizing And Efficient Scheduling of Microservices
-
Characterization of Large Language Model Development in the Datacenter
-
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
-
Rapid Deployment of DNNs for Edge Computing via Structured Pruning at Initialization
-
AQUATOPE: QoS-and-Uncertainty-Aware Resource Management for Multi-stage Serverless Workflows
-
EP4DDL: addressing straggler problem in heterogeneous distributed deep learning
-
Computation-Efficient Offloading and Power Control for MEC in IoT Networks by Meta-Reinforcement Learning
2024
-
QoS-awareness of Microservices with Excessive Loads via Inter-Datacenter Scheduling
-
Nodens: Enabling Resource Efficient and Fast QoS Recovery of Dynamic Microservice Applications in Datacenters
-
Adaptive and Scalable Caching with Erasure Codes in Distributed Cloud-Edge Storage Systems
-
COPA: A Combined Autoscaling Method for Kubernetes
-
CHRONICA: A Data-Imbalance-Aware Scheduler for Distributed Deep Learning
-
MCOTM: Mobility-aware computation offloading and task migration for edge computing in industrial IoT
-
Petrel: Heterogeneity-Aware Distributed Deep Learning Via Hybrid Synchronization
-
Deep Meta Q-Learning Based Multi-Task Offloading in Edge-Cloud Systems
-
Blender: A Container Placement Strategy by Leveraging Zipf-Like Distribution Within Containerized Data Centers
-
On Optimizing Traffic Scheduling for Multi-replica Containerized Microservices
-
Cost-Effective Data Placement in Edge Storage Systems with Erasure Code
-
Self-adaptive autoscaling algorithm for SLA-sensitive applications running on the Kubernetes clusters
-
DYVERSE: DYnamic VERtical Scaling in multi-tenant Edge environments
-
Sync-Switch: Hybrid Parameter Synchronization for Distributed Deep Learning
-
Lifting the Fog of Uncertainties: Dynamic Resource Orchestration for the Containerized Cloud
-
EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs
-
MR-DRO: A Fast and Efficient Task Offloading Algorithm in Heterogeneous Edge/Cloud Computing Environments
-
NetMARKS: Network Metrics-Aware Kubernetes Scheduler Powered by Service Mesh
2023
-
Popularity-Based Data Placement with Load Balancing in Edge Computing
-
Predictive Hybrid Autoscaling for Containerized Applications
-
Delay Aware Container Scheduling in Kubernetes
-
Communication-Aware Container Placement and Reassignment in Large-Scale Internet Data Centers
-
Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning
-
Mu: An Efficient, Fair and Responsive Serverless Framework for Resource-Constrained Edge Clouds
-
MIRAGE: A consolidation aware migration avoidance genetic job scheduling algorithm for virtualized data centers
-
Jingwei: An Efficient and Adaptable Data Migration Strategy for Deduplicated Storage Systems
-
Predictive Autoscaling of Microservices Hosted in Fog Microdata Center
-
Prediction-Based Power Oversubscription in Cloud Platforms
-
TiFL: A Tier-based Federated Learning System
-
Community-based Placement of Registries to Speed up Application Deployment on Edge Computing
-
Towards cost-aware VM migration to maximize the profit in federated clouds
-
QoS Aware FaaS for Heterogeneous Edge-Cloud Continuum
-
Adaptive AI-based auto-scaling for Kubernetes
-
MIRAS: Model-based Reinforcement Learning for Microservice Resource Allocation over Scientific Workflows
-
Pufferfish: Container-driven Elastic Memory Management for Data-intensive Applications
-
KungFu: Making Training in Distributed Machine Learning Adaptive
-
KneeScale: Efficient Resource Scaling for Serverless Computing at the Edge
-
A new container scheduling algorithm based on multi-objective optimization
-
DeepScaling: microservices autoscaling for stable CPU utilization in large scale cloud systems
-
Balancing Efficiency and Fairness in Heterogeneous GPU Clusters for Deep Learning
-
A task scheduling algorithm considering game theory designed for energy management in cloud computing
-
Disaggregated Cloud Memory with Elastic Block Management
-
Exploring Potential for Non-Disruptive Vertical Auto Scaling and Resource Estimation in Kubernetes
2022
-
Resource Elasticity In Distributed Deep Learning
-
IoT Implementation of Kalman Filter to Improve Accuracy of Air Quality Monitoring and Prediction
-
Firecracker: Lightweight Virtualization for Serverless Applications
-
vSMT-IO: Improving I/O Performance and Efficiency on SMT Processors in Virtualized Clouds
-
A two-stage container management in the cloud for optimizing the load balancing and migration cost
-
Stochastic Gradient Push for Distributed Deep Learning
-
Measurement Noise Recommendation for Efficient Kalman Filtering over a Large Amount of Sensor Data
-
Segcache: a memory-efficient and scalable in-memory key-value cache for small objects
-
Mitigating Cold Start Problem in Serverless Computing with Function Fusion
-
Paratick: Reducing Timer Overhead in Virtual Machines
-
MEAD: Model-Based Vertical Auto-Scaling for Data Stream Processing
-
Asynchronous Decentralized Parallel Stochastic Gradient Descent
-
FaaSNet: Scalable and Fast Provisioning of Custom Serverless Container Runtimes at Alibaba Cloud Function Compute
-
Faasm: Lightweight Isolation for Efficient Stateful Serverless Computing
-
Optimal Application Deployment in Mobile Edge Computing Environment
-
FlexVF: Adaptive network device services in a virtualized environment
-
Multi-objective Container Deployment on Heterogeneous Clusters
-
Resource Management of Maritime Edge Nodes for Collected Data Feedback
-
Xanadu: Mitigating cascading cold starts in serverless function chain deployments
-
Prague: High-Performance Heterogeneity-Aware Asynchronous Decentralized Training
-
Modeling of Deep Neural Network (DNN) Placement and Inference in Edge Computing
-
Container lifecycle-aware scheduling for serverless computing
-
S2H: Hypervisor as a setter within Virtualized Network I/O for VM isolation on cloud
-
Algorithms for Scheduling Scientific Workflows on Serverless Architecture
-
PS2: Parameter Server on Spark
-
VM Performance Maximization and PM Load Balancing Virtual Machine Placement in Cloud
-
Virtualization-Aware Traffic Control for Soft Real-Time Network Traffic on Xen
-
Skedulix: Hybrid Cloud Scheduling for Cost-Efficient Execution of Serverless Applications
2021
-
Concurrent Container Scheduling on Heterogeneous Clusters with Multi-Resource Constraints
-
Tiresias: A GPU Cluster Manager for Distributed Deep Learning
-
LaSS: Running Latency Sensitive Serverless Computations at the Edge
-
Shared-Memory Communication for Containerized Workflows
-
Machine Learning for Load Balancing in Cloud Datacenters
-
KubeShare: A Framework to Manage GPUs as First-Class and Shared Resources in Container Cloud
-
Predicting the End-to-End Tail Latency of Containerized Microservices in the Cloud
-
Performance Optimization for Edge-Cloud Serverless Platforms via Dynamic Task Placement
-
A-SARSA: A Predictive Container Auto-Scaling Algorithm Based on Reinforcement Learning
-
Mitigating excessive vCPU spinning in VM-agnostic KVM
-
CRAM: a Container Resource Allocation Mechanism for Big Data Streaming Applications
-
Online scheduling of heterogeneous distributed machine learning jobs
-
FIRM: An Intelligent Fine-Grained Resource Management Framework for SLO-Oriented Microservices
-
Accelerated serverless computing based on GPU virtualization
-
Draco: Architectural and Operating System Support for System Call Security
-
Parallelizing Machine Learning as a service for the end-user
-
Optimus: An Efficient Dynamic Resource Scheduler for Deep Learning Clusters
-
ES2: Building an Efficient and Responsive Event Path for I/O Virtualization
-
Ownership: A Distributed Futures System for Fine-Grained Tasks
-
AutoSync: Learning to Synchronize for Data-Parallel Distributed Deep Learning
-
Efficient Fault Tolerance through Dynamic Node Replacement
-
RLSK: A Job Scheduler for Federated Kubernetes Clusters based on Reinforcement Learning
-
DRAGON: A Dynamic Scheduling and Scaling Controller for Managing Distributed Deep Learning Jobs in Kubernetes Cluster
-
Fine-grained Autoscaling with In-VM Containers and VM Introspection
-
Remote regions: a simple abstraction for remote memory
-
Diminishing Returns and Deep Learning for Adaptive CPU Resource Allocation of Containers
-
Costless: Optimizing Cost of Serverless Computing through Function Fusion and Placement
-
Horizontal and Vertical Scaling of Container-based Applications using Reinforcement Learning
-
MARBLE: A Multi-GPU Aware Job Scheduler for Deep Learning on HPC Systems
-
Effectively Mitigating I/O Inactivity in vCPU Scheduling
2020
-
Robustness-oriented k Edge Server Placement
-
A NSGA-II-based Approach for Multi-objective Micro-service Allocation in Container-based Clouds
-
Adaptive AI-based auto-scaling for Kubernetes
-
DeepPerf: Performance Prediction for Configurable Software with Deep Sparse Neural Network
-
An efficient virtual cpu scheduling in cloud computing
-
Performance Characterization of DNN Training
-
Throughput-Oriented GPU Memory Allocation
-
On Uncoordinated Service Placement in Edge-Clouds
-
Outsourced Proofs of Retrievability
2019
-
Optimizing Validation Phase of Hyperledger Fabric
-
Delay-Aware Accident Detection and Response System Using Fog Computing
-
Data transmission plan adaptation complementing strategic time-network selection for connected vehicles
-
vCPU as a Container: Towards Accurate CPU Allocation for VM
-
Design and Implementation on Hyperledger-Based Emission Trading System
-
Machine Learning for Performance Prediction of Spark Cloud Applications
-
GMOD - A Dynamic GPU Memory Overflow Detector
-
An Energy-Efficient and Deadline-Aware Task Offloading Strategy Based on Channel Constraint for Mobile Cloud Workflows
-
Data-Driven Serverless Functions for Object Storage
-
An empirical study on real-time data analytics for connected cars: Sensor-based applications for smart cars
-
vSimilar: A high-adaptive VM scheduler based on the CPU pool mechanism
-
A Performance Prediction Framework for Irregular Applications
-
Using Criticality of GPU Accesses in Memory Management for CPU-GPU Heterogeneous Multi-Core Processors
-
A deadline-constrained multi-objective task scheduling algorithm in Mobile Cloud environments
-
SAND: Towards High-Performance Serverless Computing
-
Performance Benchmarking and Optimizing Hyperledger Fabric Blockchain Platform
-
Autonomous data driven surveillance and rectification system using in-vehicle sensors for intelligent transportation systems (ITS)
-
Improving Spark Application Throughput Via Memory Aware Task Co-location: A Mixture of Experts Approach
-
TerrierTail: Mitigating Tail Latency of Cloud Virtual Machines
-
MASK - Redesigning the GPU Memory Hierarchy to Support Multi-Application Concurrency
-
A scheduling algorithm for autonomous driving tasks on mobile edge computing servers
-
SOCK: Rapid Task Provisioning with Serverless-Optimized Containers
-
Hyperledger Fabric: A Distributed Operating System for Permissioned Blockchains
-
Recovery for overloaded mobile edge computing
-
Dynamic Control of CPU Usage in a Lambda Platform
-
Riffle: Optimized Shuffle Service for Large-Scale Data Analytics
-
A Virtual Multi-Channel GPU Fair Scheduling Method for Virtual Machines
2018
-
DMPO: Dynamic mobility-aware partial offloading in mobile edge computing
-
Container-Based Cloud Platform for Mobile Computation Offloading
-
An accurate resource scheduling system for virtual machines based on CPU load monitoring and assessment
-
Smart VM co-scheduling with the precise prediction of performance characteristics
-
Re-optimizing Data-Parallel Computing
-
Efficient Sharing and Fine-Grained Scheduling of Virtualized GPU Resources
-
Enhancing performance of IoT applications with load prediction and cloud elasticity
-
Large-scale cluster management at Google with Borg
-
Deploying High Throughput Scientific Workflows on Container Schedulers with Makeflow and Mesos
-
Learning-Based Memory Allocation Optimization for Delay-Sensitive Big Data Processing
-
Efficient Service Handoff Across Edge Servers via Docker Container Migration
-
A load-aware resource allocation and task scheduling for the emerging cloudlet system
-
Tableau
-
I/O Congestion-Aware Computing Resource Assignment and Scheduling in Virtualized Cloud Environments
-
FairGV: Fair and Fast GPU Virtualization
-
Resource-aware virtual machine migration in IoT cloud
-
RTVirt: Enabling Time-sensitive Computing on Virtualized Systems through Cross-layer CPU Scheduling
-
ppXen: A hypervisor CPU scheduler for mitigating performance variability in virtualized clouds
-
TensorFlow: A system for large-scale machine learning
-
Doris, An Adaptive Soft Real-Time Scheduler in Virtualized Environments
-
EAERS: An Enhanced Version of Autonomic and Elastic Resource Scheduling Framework for Cloud Applications
-
Raccoon
-
SLA-Based Resource Scheduling for Big Data Analytics as a Service in Cloud Computing Environments
-
Efficient consolidation-aware VCPU scheduling on multicore virtualization platform
-
Cloud Resource Management With Turnaround Time Driven Auto-Scaling
2017
-
Matrix Computations and Optimization in Apache Spark
-
A framework to address inconstant user requirements in cloud SLAs management
-
CtrlCloud: Performance-Aware Adaptive Control for Shared Resources in Clouds
-
Representing Job Scheduling for Volunteer Grid Environment using Online Container Stowage
-
A Reliable Volunteer Computing System with Credibility-based Voting
-
Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics
-
LoGA: Low-overhead GPU accounting using events
-
Distributed volunteer computing for solving ensemble learning problems
-
CloudTalk: Enabling Distributed Application Optimisations in Public Clouds
-
Application-specific quantum for multi-core platform scheduler
-
Use of Reactive and Proactive Elasticity to Adjust Resources Provisioning in the Cloud Provider
-
KeystoneML: Optimizing Pipelines for Large Scale Advanced Analytics
-
Boosting gLite with cloud augmented volunteer computing
-
Decentralised workflow scheduling in volunteer computing systems
-
CHOPPER: Optimizing Data Partitioning for In-Memory Data Analytics Frameworks
-
A Cloud Gaming System Based on User-Level Virtualization and Its Resource Scheduling
-
A lightweight plug-and-play elasticity service for self-organizing resource provisioning on parallel applications
-
Analysis, Modeling, and Simulation of Hadoop YARN MapReduce
-
vScale: Automatic and Efficient Processor Scaling for SMP Virtual Machines
-
Spark versus Flink: Understanding Performance in Big Data Analytics Frameworks
-
A user mode CPU–GPU scheduling framework for hybrid workloads
-
Resource-aware hybrid scheduling algorithm in heterogeneous distributed computing
-
FemtoClouds: Leveraging Mobile Devices to Provide Cloud Service at the Edge
-
vProbe: Scheduling Virtual Machines on NUMA Systems
-
Rich feature hierarchies for accurate object detection and semantic segmentation
2016
-
Mobile Big Data Analytics Using Deep Learning and Apache Spark
-
u-root: A Go-based, Firmware Embeddable Root File System with On-Demand Compilation
-
PriDyn - Enabling Differentiated IO Services in Cloud Using Dynamic Priorities
-
AutoElastic: Automatic Resource Elasticity for High Performance Applications in the Cloud
-
Service Level and Performance Aware Dynamic Resource Allocation in Overbooked Data Centers
-
Controlling the deployment of virtual machines on clusters and clouds for scientific computing in CBRAIN
-
Faster Jobs in Distributed Data Processing using Multi-Task Learning
-
A Preemptive Execution System for GPGPU Computing
-
Toward Locality-aware Scheduling for Containerized Cloud Services
-
Prediction mechanisms for monitoring state of cloud resources using Markov chain model
-
Improved auto control ant colony optimization using lazy ant approach for grid scheduling problem
-
Automating Model Search for Large Scale Machine Learning
-
gScale - Scaling up GPU Virtualization with Dynamic Sharing of Graphics Memory Space
-
MVAPICH2 over OpenStack with SR-IOV: An Efficient Approach to Build HPC Clouds
-
Contiguity and Locality in Backfilling Scheduling
-
Service Clustering for Autonomic Clouds Using Random Forest
-
A Hybrid Approach To Processing Big Data Graphs on Memory-Restricted Systems
-
Fast and reliable restoration method of virtual resources on OpenStack
-
A Priority-Based Scheduling Heuristic to Maximize Parallelism of Ready Tasks for DAG Applications
-
GPUswap - Enabling Oversubscription of GPU Memory through Transparent Swapping
-
DocLite: A Docker-Based Lightweight Cloud Benchmarking Tool
-
Scheduling parallel jobs with tentative runs and consolidation in the cloud
-
Optimization of virtual resource management for cloud applications to cope with traffic burst
-
GPUvm: Why Not Virtualizing GPUs at the Hypervisor
-
Resource preprocessing and optimal task scheduling in cloud computing environments
-
REST Service Framework for Fine-Grained Resource Management in Container-Based Cloud
-
Analysis of the Impact of CPU Virtualization on Parallel Applications in Xen
-
Dynamic Scheduling and Pricing in Wireless Cloud Computing
-
Activity recognition on streaming sensor data
2015
-
Boosting GPU Virtualization Performance with Hybrid Shadow Page Tables
-
Optimizing Soft Real-Time Scheduling Performance for Virtual Machines with SRT-Xen
-
A solution of dynamic VMs placement problem for energy consumption optimization based on evolutionary game theory
-
aDock: A Cloud Infrastructure Experimentation Environment based on OpenStack and Docker
-
An Adaptive IO Prefetching Approach for Virtualized Data Centers
-
OpenMP task scheduling strategies for multicore NUMA systems
-
Job Scheduling for Cloud Computing Integrated with Wireless Sensor Network
-
Tool support for automated workload model creation from web server logs
-
IO Paravirtualization at the Device File Boundary
-
Real-Time Multi-Core Virtual Machine Scheduling in Xen
-
A Decentralized Multi-agent Approach to Job Scheduling in Cloud Environment
-
Performance-to-Power Ratio Aware Virtual Machine (VM) Allocation in Energy-Efficient Clouds
-
Automated Synthesis and Deployment of Cloud Applications
-
Poris - A Scheduler for Parallel Soft Real-Time Applications in Virtualized Environments
-
vSlicer: latency-aware virtual machine scheduling via differentiated-frequency CPU slicing
-
Towards Efficient Work-stealing in Virtualized Environments
-
A design space for dynamic service level agreements in OpenStack
-
A survey on resource allocation in high performance distributed computing systems
-
Improving the Time of Live Migration Virtual Machine by Optimized Algorithm Scheduler Credit
-
iTune: Engineering the Performance of Xen Hypervisor via Autonomous
-
Towards Fair and Efficient SMP Virtual Machine Scheduling
-
Dynamic Virtual Machine Migration Algorithms Using Enhanced Energy Consumption Model for Green Cloud Data Centers
-
VUPIC - Virtual Machine Usage Based Placement in IaaS Cloud
-
Using Imbalance Metrics to Optimize Task Clustering in Scientific Workflow Execution
-
Workload-Aware Credit Scheduler for Improving Network IO Performance in Virtualization Environment
2014
-
A Full GPU Virtualization Solution with Mediated Pass-Through
-
Energy-efficiency enhanced virtual machine scheduling policy for mixed workloads in cloud environments
-
Scheduling overcommitted VM: Behavior monitoring and dynamic switching frequency scaling
-
Energy-credit scheduler: An energy-aware virtual machine scheduler for cloud systems
-
Workflow Clustering Method Based on Process Similarity
-
VGRIS: virtualized GPU resource isolation and scheduling in cloud gaming
-
Prioritizing Local Inter-Domain Communication in Xen
-
Scheduling para-virtualized virtual machines based on events
-
Energy-aware Load Balancing Policies for the Cloud Ecosystem
-
Optimized Hypervisor Scheduler for Parallel Discrete Event Simulations on Virtual Machine Platforms
-
A Survey on Cloud Computing
-
Optimization of Xen scheduler for multitasking
-
An Improved Xen Credit Scheduler for IO Latency-Sensitive Applications on Multicores
-
RT-Xen: Towards Real-time Hypervisor Scheduling in Xen
-
A multi-objective ant colony system algorithm for virtual machine placement in cloud computing
-
Scientific Workflow Applications on Amazon EC2
-
Order-Preserving Renaming in Synchronous Systems with Byzantine Faults
-
Applicability of GPGPU Computing to Real Time AI Solutions in Games
-
KMA: A Dynamic Memory Manager for OpenCL
-
A Virtualized Separation Kernel for Mixed Criticality Systems
-
Server consolidation with migration control for virtualized data centers
-
The Weighted Byzantine Agreement Problem
-
Exploiting Joint Wifi/Bluetooth Trace to Predict People Movement
-
VOCL - An Optimized Environment for Transparent Virtualization of Graphics Processing Units
-
A Multi-Objective Optimization Scheduling Method Based on the Ant Colony Algorithm in Cloud Computing