Computer Standards & Interfaces 97 (2026) 104098

Contents lists available at ScienceDirect

Computer Standards & Interfaces

journal homepage: www.elsevier.com/locate/csi

An autonomous deep reinforcement learning-based approach for memory configuration in serverless computing

Zahra Shojaee Rad, Mostafa Ghobaei-Arani *, Reza Ahsan

Department of Computer Engineering, Qo.C., Islamic Azad University, Qom, Iran
A R T I C L E  I N F O

Keywords:
Serverless computing
Memory configuration
Deep reinforcement learning
Autonomous computing
Function-as-a-service

A B S T R A C T

Serverless computing has become very popular in recent years due to its cost savings and flexibility. It is a cloud computing model that allows developers to create and deploy code without having to manage the infrastructure, and it has been embraced for its scalability, cost savings, and ease of use. However, memory configuration is one of the important challenges in serverless computing due to the transient nature of serverless functions, which are stateless and ephemeral. In this paper, we propose an autonomous approach using deep reinforcement learning and a reward mechanism for memory configuration, called Auto Opt Mem. In the Auto Opt Mem mechanism, the system learns to allocate memory resources to serverless functions in a way that balances overall performance and minimizes wastage of resources. Finally, we validate the effectiveness of our solution; the findings reveal that the Auto Opt Mem mechanism enhances resource utilization, reduces operation cost and latency, and improves quality of service (QoS). Our experiments demonstrate that the Auto Opt Mem mechanism achieves 16.8 % lower latency compared to static allocation, an 11.8 % cost reduction, and a 6.8 % improvement in QoS, resource utilization, and memory-allocation efficiency compared with baseline methods.
1. Introduction

Serverless computing has emerged as an extended cloud computing model that offers many advantages in flexibility, scalability, and cost efficiency [1]. By separating out the management of the underlying infrastructure, developers can focus on writing and deploying code without worrying about server provisioning or maintenance. There has been a lot of progress in various areas related to serverless computing [2]. One aspect is Function as a Service (FaaS), which has increasingly been associated with a variety of applications, including video streaming platforms [3], multimedia processing [4], Continuous Integration/Continuous Deployment (CI/CD) pipelines [5], Artificial Intelligence/Machine Learning (AI/ML) inference tasks [6], and query processing for Large Language Models (LLMs) [7]. FaaS is a serverless cloud computing model that allows developers to run small, manageable services in isolated environments called function instances [8].

Despite these advantages, memory configuration in serverless environments is a complex challenge due to the transient and stateless nature of serverless functions. Choosing the right amount of memory or resource size is important and challenging because it can result in faster execution times and lower costs. A recent survey found that 47 % of serverless functions in production use the default memory size, indicating that developers often overlook the importance of resource sizing [9]. Traditional memory configuration methods are often manual settings or static allocation, which may lead to inefficiencies such as overprovisioning or underutilization. These inefficiencies can lead to increased costs or decreased performance and affect the effectiveness of the serverless function. By employing deep learning models, memory configuration can be automated, leading to an efficient solution. Deep learning models analyze historical data to dynamically predict optimal memory settings, adapt to varying workloads, and minimize latency and cost. This approach uses the ability of deep learning to identify complex patterns and relationships in data, enabling more accurate and efficient resource management.

Recent research shows the importance of intelligent and autonomous systems in different domains. For instance, Arduino-based IoT automation systems have shown how lightweight and adaptive architectures can improve efficiency and minimize manual intervention in constrained environments [10]. Likewise, autonomous AI frameworks for fraud detection in the Dark Web demonstrate the ability of self-learning mechanisms to adapt to dynamic and unpredictable conditions [11]. These advances further motivate the need for AI-driven autonomic

* Corresponding author.
E-mail address: mo.ghobaei@iau.ac.ir (M. Ghobaei-Arani).

https://doi.org/10.1016/j.csi.2025.104098
Received 2 May 2025; Received in revised form 16 November 2025; Accepted 17 November 2025
Available online 19 November 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
solutions, such as our proposed Auto Opt Mem framework for serverless memory configuration.

1.1. Research gap and motivation

Previous approaches are often static or limited to a specific platform. They do not meet the need for real-time adaptability to changing workloads, as they have relied solely on statistical modeling and lacked the ability to adapt in real time. The direct relationship between cost, latency, and QoS has rarely been considered in a comprehensive framework. Our work addresses these research gaps by introducing Auto Opt Mem, based on the MAPE loop and DRL.

The motivation for this study is that many functions still use default memory values, resulting in wasted resources, increased costs, and reduced quality of service (QoS). For example, 47 % of users rely on default memory settings, which emphasizes the importance of intelligent and adaptive optimization [9]. Addressing this gap is essential to improve performance and cost-effectiveness in serverless environments.

1.2. Our approach

In this paper, we propose an autonomous memory configuration approach with a deep learning model that predicts memory settings, based on a combination of the concept of autonomic computing and deep reinforcement learning (DRL), with the aim of increasing performance and cost-effectiveness. To realize autonomic computing, IBM introduced a reference framework for autonomic control loops known as the MAPE (Monitor, Analyze, Plan, Execute) loop [12,13]. This MAPE control loop resembles the general agent model put forth by Russell and Norvig [14], where an intelligent agent observes its surroundings through sensors and utilizes these observations to decide on actions to take within that environment. The proposed approach follows the MAPE control loop, which consists of four phases: monitoring (M), analysis (A), planning (P), and execution (E). First, in the monitoring phase, the system observes and collects the current state of the environment (memory in serverless functions). In the analysis phase, the agent, which is a deep neural network, analyzes the observed state and updates its policy based on it and the reward received. In the planning phase, the agent schedules an action (i.e., a memory configuration) based on the learned policy. In the execution phase, the scheduled action is applied to the environment. We utilize Deep Reinforcement Learning (DRL) [15,16] as a decision-making tool that leverages the predicted outcomes from the analysis phase to determine the best memory configuration during the planning phase. Reinforcement Learning (RL) is a self-learning approach that enhances its effectiveness by continuously interacting with the cloud environment.

1.3. Main contributions

The main contributions of this research can be summarized as follows:

• We propose an autonomic method using deep reinforcement learning to predict memory configuration; this method operates based on a reward mechanism.
• We designed a multi-objective reward normalization mechanism that simultaneously balances latency, cost, utilization, and QoS.
• We integrated the MAPE-K control loop with deep reinforcement learning (DRL) to enable closed-loop online adaptation.
• Auto Opt Mem supports real-time continuous learning across varying workloads, which clearly differentiates it from static or offline ML-based predictors.
• Experiments validate the effectiveness of the proposed method and demonstrate performance improvements in metrics such as latency and cost.

1.4. Paper organization

This paper is organized into several sections: Section 2 reviews related memory configuration methods. Section 3 offers background information. Section 4 presents a comprehensive explanation of the proposed solution. Section 5 assesses and discusses the experimental results. Section 6 presents the discussion. Section 7 presents the conclusions with our findings and outlines future research directions.

2. Related works

This section discusses memory configuration approaches in serverless computing. These approaches are categorized into three main groups: machine learning-based approaches, heuristic-based approaches, and framework-based approaches.

2.1. Machine learning-based approaches

Simon Eismann et al. [17] have presented an approach called "Sizeless" for predicting the optimal resource size for serverless functions in cloud computing, based on monitoring data from a single memory size. It highlights the challenges developers face in selecting resource sizes and shows that the method can achieve an average prediction error of 15.3 %, optimizing memory allocation for 79 % of functions, resulting in a 39.7 % speedup and a 2.6 % cost reduction. Anshul Jindal et al. [18] have presented a tool called FnCapacitor for estimating the Function Capacity (FC) of Function-as-a-Service (FaaS) functions in serverless computing environments. It overcomes performance challenges due to system abstractions and dependencies between functions. Through load testing and modeling, FnCapacitor provides accurate FC predictions using statistical methods and deep learning, demonstrating effectiveness on platforms like AWS Lambda and Google Cloud Functions. Djob Mvondo et al. [19] have presented OFC, an in-memory caching system designed for Function-as-a-Service (FaaS) platforms to improve performance by reducing latency during data access. It leverages machine learning to predict memory requirements for function invocations, utilizing otherwise wasted memory from over-provisioning and idle sandboxes. OFC demonstrates significant execution time improvements for both single-stage and pipelined functions, enhancing efficiency without requiring changes to existing application code. Myung-Hyun Kim et al. [20] have introduced ShmFaas, a serverless platform designed to improve memory utilization for deep neural network (DNN) inference by sharing models in-memory across containers. They address data duplication and cold start issues, particularly in resource-constrained edge cloud environments. Experimental results show that ShmFaas reduces memory usage by over 29.4 % compared to common systems, while maintaining negligible latency overhead and enhancing throughput. Siddharth Agarwal et al. [21] have presented MemFigLess, an input-aware memory allocation framework for serverless computing functions, designed to optimize resource usage and reduce costs. Using a multi-output Random Forest Regression model, it correlates the input features of the function with memory requirements, leading to accurate memory configuration. The evaluation shows that MemFigLess can significantly decrease resource allocation and save on runtime costs. Finally, Table 1 shows a comparison of machine learning-based approaches from related studies.

2.2. Heuristic-based approaches

Goor-Safarian et al. [22] have presented SLAM, a tool for optimizing memory settings for serverless applications consisting of multiple Function-as-a-Service (FaaS) functions. It addresses the issues in balancing cost and performance while meeting Service Level Objectives (SLOs). By utilizing distributed tracing, SLAM estimates execution times under various memory settings and identifies optimal configurations. Robert Cordingly et al. [23] presented a method called CPU Time
Table 1
Comparison of machine learning-based approaches.

[17] Metrics: Execution time; resource consumption; performance overhead. Method: Multi-target regression model on monitoring data. Advantages: Reduces execution time; decreases cost; no performance test required. Disadvantage: Limited to specific cloud providers. Tools: AWS Lambda; Node.js.

[18] Metric: Function Capacity (FC). Method: Statistical and deep learning approaches. Advantage: Accurate FC predictions. Disadvantage: Limited to specific FaaS platforms. Tools: Python; FnCapacitor; Google Cloud Functions (GCF); AWS Lambda.

[19] Metric: Function Capacity (FC). Method: OFC tool using machine learning. Advantages: Reduction in execution time; cost-effective; utilizes idle memory; transparent. Disadvantage: Overhead from cache management. Tools: Python; Java; OFC; AWS.

[20] Metric: Efficiency of memory usage. Method: ShmFaas system shares DNN models in-memory. Advantages: Reduces memory usage; minimizes cold start delays; minimal code changes. Disadvantage: Complexity. Tools: Python; ShmFaas; Kubernetes.

[21] Metric: Memory utilization. Method: Input-aware Random Forest Regression. Advantages: Reduces memory allocation; reduces costs. Disadvantages: Limited to specific platforms; overhead from monitoring. Tools: AWS Lambda; Python; AWS CloudWatch.
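As a concrete illustration of the input-aware sizing idea surveyed above, the sketch below maps a function's input size to the smallest sufficient memory tier. It is a minimal stand-in, not the implementation of any cited system: the profiling points, the allowed tiers, the headroom factor, and the use of plain interpolation in place of a Random Forest model are all illustrative assumptions.

```python
# Illustrative sketch of input-aware memory sizing in the spirit of the
# regression-based approaches in Table 1 (e.g., MemFigLess [21]).
# The profiling data, tiers, and headroom factor are hypothetical.

from bisect import bisect_left

# Hypothetical profiling data: (input size in MB, peak memory used in MB)
PROFILE = [(1, 140), (5, 180), (10, 260), (50, 610), (100, 1100)]

ALLOWED_SIZES = [128, 256, 512, 1024, 2048]  # typical FaaS memory tiers (MB)

def predict_peak_memory(input_mb: float) -> float:
    """Linear interpolation over the profiled (input, peak-memory) points."""
    xs = [x for x, _ in PROFILE]
    ys = [y for _, y in PROFILE]
    if input_mb <= xs[0]:
        return ys[0]
    if input_mb >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, input_mb)
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (input_mb - x0) / (x1 - x0)

def configure_memory(input_mb: float, headroom: float = 1.2) -> int:
    """Pick the smallest allowed tier covering the predicted peak plus headroom."""
    need = predict_peak_memory(input_mb) * headroom
    for size in ALLOWED_SIZES:
        if size >= need:
            return size
    return ALLOWED_SIZES[-1]
```

For example, a 10 MB input interpolates to a 260 MB predicted peak, which with 20 % headroom selects the 512 MB tier. A real system would replace the interpolation with a trained regressor and refresh the profile from monitoring data.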
Accounting Memory Selection (CPU-TAMS) to optimize memory configurations for serverless Function-as-a-Service (FaaS) platforms. CPU-TAMS uses CPU time accounting and regression modeling methods to provide recommendations that reduce execution time and costs. Tetiana Zubko et al. [24] presented MAFF (Memory Allocation Framework for FaaS functions), which is a framework to optimize memory allocation for serverless functions automatically. MAFF adapts memory settings based on function requirements and employs various algorithms to minimize costs and execution duration. The framework was tested on AWS Lambda, demonstrating improved performance compared to existing memory optimization tools. Josef Spillner [25] discussed resource management for serverless functions, focusing on memory tracking, profiling, and automatic tuning. The author outlines the issues that developers face in determining memory allocation due to coarse-grained configurations from cloud providers, and proposes tools to measure memory consumption and dynamically adjust allocations to reduce waste and costs and improve performance in Function-as-a-Service (FaaS) environments.

Zengpeng Li et al. [26] have presented algorithms for optimizing memory configuration in serverless workflow applications, specifically a heuristic urgency-based algorithm (UWC) and a meta-heuristic hybrid algorithm (BPSO). These algorithms aim to balance execution time and cost for serverless applications, solving the challenges posed by memory allocation and performance modeling. Andrea Sabioni et al. [27] have introduced a shared-memory approach for function chaining on serverless platforms and proposed a container-based architecture that increases the efficiency of function composition on the same host. By using a message-oriented middleware that operates over shared memory, this approach reduces response latency and improves resource utilization. The results show performance improvements in request completion rates and reduced latency during function execution. Aakanksha Saha et al. [28] have presented EMARS, an efficient resource management system designed for serverless cloud computing, focusing on optimizing memory allocation for containers. Built on the OpenLambda platform, EMARS uses predictive models based on application workloads to adjust memory limits dynamically, enhancing resource utilization and reducing latency. Experiments demonstrate that tailored memory settings improve performance in serverless functions. Amit Samanta et al. [29] have discussed the issues and opportunities of integrating persistent memory (PM) into serverless computing. They show how PM's unique characteristics, such as its direct load/store access, can increase performance but also lead to bottlenecks when multiple threads concurrently write to it. They propose a PM-aware scheduling system for serverless workloads that optimizes job completion time by managing concurrent access and improving efficiency while ensuring fairness among applications. Meenakshi Sethunath et al. [30] have proposed a joint function warm-up and request routing scheme for serverless computing that optimally utilizes both edge and cloud resources. It addresses issues like high latency and cold-start delays by maximizing the hit ratio of requests, and it reduces latency by considering memory and budget constraints. Anisha Kumari et al. [31] have proposed a performance model for optimizing resource allocation in serverless applications, addressing issues like cost estimation and performance evaluation. It introduces a greedy optimization algorithm to improve end-to-end response time while considering budget constraints. They utilize serverless applications on AWS to analyze the trade-offs between performance and cost, demonstrating the model's effectiveness in finding optimal resource configurations. Finally, Table 2 shows a comparison of heuristic-based approaches from related studies.

2.3. Framework-based approaches

Anjo Vahldiek-Oberwagner et al. [32] have proposed a Memory-Safe Software and Hardware Architecture (MeSHwA) to enhance serverless computing and microservices by leveraging memory-safe languages like Rust and WebAssembly. It aims to reduce infrastructure overheads associated with cloud architectures while improving performance and security through a unified runtime environment that isolates services effectively. Divyanshu Saxena et al. [33] have presented Medes, a serverless computing framework that improves performance and resource efficiency by introducing a deduplicated sandbox state. This state reduces memory usage by removing redundant memory chunks across sandboxes, allowing for faster function startups and improved management of warm and cold states. Experiments show that Medes can reduce end-to-end latency and cold starts under memory pressure. Ji Li et al. [34] designed TETRIS, a memory-efficient serverless platform for deep learning inference. It addresses memory overconsumption in serverless systems by implementing tensor sharing and runtime optimization, reducing memory usage while increasing function density. TETRIS automates memory sharing and instance scheduling,
Table 2
Comparison of heuristic-based approaches.

[22] Metric: Memory configuration effectiveness based on Service Level Objectives (SLOs). Methods: Distributed tracing; max-heap-based optimization algorithm. Advantage: Balances cost and performance. Disadvantage: Limited to specific platforms. Tools: Python; AWS; SLAM.

[23] Metric: Efficiency. Methods: CPU time accounting; regression modeling. Advantages: Reduces runtime; reduces cost. Disadvantage: Limited to specific FaaS platforms. Tools: Python; AWS; GCF.

[24] Metric: Effectiveness for FaaS functions. Method: Linear, Binary, and Gradient Descent algorithms for self-adaptive memory optimization. Advantages: Lower cost; faster execution. Disadvantage: Requires specific function profiling. Tools: Python; AWS.

[25] Metrics: Efficiency; cost optimization; autoscaling of resources. Method: Memory tracing, profiling, and autotuning tools. Advantages: Reduces cost; improves performance. Disadvantage: Requires extensive profiling data. Tools: AWS; Functracer; Autotuner; costcalculator.

[26] Metric: Efficiency. Method: Heuristic (UWC) and meta-heuristic (BPSO) algorithms. Advantages: Time-cost tradeoff; optimal workflow. Disadvantages: Complexity; computational overhead. Tools: AWS; Python; UWC; BPSO.

[27] Metrics: Response latency; resource usage. Method: Shared memory using a message-oriented middleware. Advantages: Improves request completion rates; improves response time; optimizes resource usage. Disadvantages: Limited to co-located functions; complexity in managing shared memory. Tools: Message-oriented middleware.

[28] Metrics: Memory allocation efficiency; latency. Methods: Workload-based model; memory-based model. Advantages: Optimizes resource usage; reduces latency. Disadvantage: Complexity. Tools: OpenLambda; Python.

[29] Metrics: Throughput; job completion time (JCT). Method: Performance modeling and admission control for concurrent access. Advantages: Improves throughput; reduces latency. Disadvantage: Complexity. Tools: OpenFaaS; Intel Optane DCPMM.

[30] Metrics: Latency reduction; request hit ratio. Methods: Joint function warm-up; routing for edge and cloud collaboration. Advantages: Reduces cold-start latency; optimizes performance. Disadvantages: Complexity; dependent on accurate profiling. Tools: AWS; Azure Functions.

[31] Metrics: End-to-end response time; cost. Method: Greedy optimization algorithm. Advantages: Reduces latency; reduces cost; handles cold start delay. Disadvantages: Complexity; dependent on accurate profiling. Tools: Amazon Web Services (AWS).
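Several of the heuristics above search the discrete space of memory tiers directly. The sketch below shows one such strategy, a binary search for the smallest tier that meets a latency SLO, in the spirit of the Binary algorithm in MAFF [24]; the latency model, the GB-second pricing constant, and the SLO value are hypothetical stand-ins for real profiling runs.

```python
# Illustrative sketch of a heuristic memory-tier search. The latency
# model and pricing below are hypothetical, not measurements from any
# cited system.

from typing import Optional

SIZES = [128, 256, 512, 1024, 2048, 4096]  # candidate memory tiers (MB)

def measure_latency_ms(memory_mb: int) -> float:
    # Stand-in for a real profiling run: latency shrinks as memory
    # (and the CPU share typically tied to it) grows.
    return 8000 * 128 / memory_mb + 120

def cost_per_invocation(memory_mb: int) -> float:
    # GB-second style pricing: duration x allocated memory x unit price
    # (the $/GB-s rate here is a hypothetical constant).
    gb_seconds = (measure_latency_ms(memory_mb) / 1000) * (memory_mb / 1024)
    return gb_seconds * 0.0000166667

def smallest_size_meeting_slo(slo_ms: float) -> Optional[int]:
    """Binary search: latency is monotonically non-increasing in memory,
    so the smallest SLO-compliant tier is found in O(log n) probes."""
    lo, hi, best = 0, len(SIZES) - 1, None
    while lo <= hi:
        mid = (lo + hi) // 2
        if measure_latency_ms(SIZES[mid]) <= slo_ms:
            best = SIZES[mid]   # feasible; try a smaller tier
            hi = mid - 1
        else:
            lo = mid + 1        # too slow; need more memory
    return best
```

With these toy numbers, a 1500 ms SLO selects the 1024 MB tier after three probes instead of profiling all six tiers; the same skeleton applies when each probe is an actual deployment-and-measure cycle.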
providing efficient resource utilization without compromising performance.

Dmitrii Ustiugov et al. [35] have discussed cold-start latency in serverless computing and introduced vHive, an open-source framework for experimentation. It shows the inefficiencies of snapshot-based function invocations, which can result in high execution times due to frequent page faults. They propose REAP, which prefetches memory pages to reduce cold-start delays by 3.7 times, improving the performance of serverless functions. Ao Wang et al. [36] have presented INFINICACHE, an innovative in-memory object caching system leveraging ephemeral serverless functions to provide a cost-effective solution for large-object caching in web applications. It shows the system's ability to achieve cost savings while maintaining high data availability and performance through techniques like erasure coding and intelligent resource management. Anurag Khandelwal et al. [37] have presented Jiffy, an elastic far-memory system for serverless analytics. It overcomes the limitations of existing memory allocation systems by enabling fine-grained, block-level resource allocation, allowing jobs to meet their real-time memory needs. Jiffy minimizes performance degradation and resource underutilization by dynamically managing memory for individual tasks. Orestis Lagkas Nikolos et al. [38] have introduced HotMem, a mechanism designed to enhance memory reclamation in serverless computing environments for Function-as-a-Service (FaaS) models using microVMs. It addresses the issue of memory elasticity during scale-down by segregating memory allocations for individual function instances, thereby enabling rapid and efficient reclamation without the overhead of page migrations. HotMem improves memory management performance, maintaining low latency for function execution. Finally, Table 3 shows a comparison of framework-based approaches from related studies.

3. Background

In this section, we explain the concepts of serverless computing and then provide explanations about memory configuration in serverless computing.

3.1. Serverless computing

Serverless computing enables developers to concentrate on coding without the necessity of managing or provisioning servers, which is why it's termed "serverless." Serverless computing provides an efficient and scalable solution for running programs. Its ease of management and lightweight features have made it popular as an implementation model
Table 3
Comparison of framework-based approaches.

[32] Metrics: Performance and resource efficiency; resource sharing. Method: MeSHwA, a memory-safe software and hardware architecture. Advantage: Increases security and performance. Disadvantage: Complexity. Tools: Python; Rust; Wasm.

[33] Metrics: End-to-end latency; memory usage. Method: Medes, a framework utilizing memory deduplication to create a deduplicated sandbox state. Advantages: Reduces cold starts; increases flexibility; optimal memory. Disadvantages: Complexity; overhead from deduplication. Tools: AWS Lambda; Python; OpenWhisk; OpenFaaS; FunctionBench; CRIU.

[34] Metric: Function startup latency. Method: Runtime sharing to reduce memory consumption. Advantages: Memory savings; reduces cold starts. Disadvantages: Complexity; tensor management overhead. Tools: OpenFaaS; TensorFlow.

[35] Metrics: Cold-start latency; memory efficiency. Method: REAP, a mechanism that records and prefetches a function's working set. Advantage: Reduces cold-start latency. Disadvantage: Overhead on the first invocation. Tools: Python; Kubernetes.

[36] Metrics: Latency of function invocation; data availability. Method: INFINICACHE, in-memory object caching with erasure coding. Advantage: Cost saving. Disadvantages: Limited to large-object caching; relies on the limitations of serverless architecture. Tools: AWS Lambda.

[37] Metrics: Execution time; memory utilization. Method: Jiffy, an elastic far-memory system. Advantages: Improves execution time; increases resource utilization. Disadvantages: Complexity; relies on a specific serverless architecture. Tools: Amazon EC2; Python; C++; Java.

[38] Metrics: Memory speed; tail latency. Method: HotMem, a memory management framework with rapid reclamation of hotplugged memory. Advantage: Faster memory reclamation. Disadvantage: Challenges in memory management. Tools: OpenWhisk; Azure.
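The sandbox-state deduplication idea behind Medes [33] can be sketched in a few lines: memory images are split into fixed-size chunks, and identical chunks across sandboxes are stored only once under their content hash. The chunk size and the synthetic sandbox images below are illustrative assumptions, not Medes's actual mechanism.

```python
# Illustrative sketch of cross-sandbox memory deduplication in the
# spirit of Medes [33]. Chunk size and sandbox contents are hypothetical.

import hashlib

CHUNK = 4096  # bytes per chunk (page-sized, for illustration)

def dedup(sandboxes: list) -> tuple:
    """Return a content-addressed chunk store plus, per sandbox, the
    list of chunk hashes that reconstructs its memory image."""
    store = {}    # digest -> chunk bytes (each distinct chunk stored once)
    layouts = []  # per-sandbox sequence of digests
    for image in sandboxes:
        layout = []
        for off in range(0, len(image), CHUNK):
            chunk = image[off:off + CHUNK]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # skip chunks already stored
            layout.append(digest)
        layouts.append(layout)
    return store, layouts

# Two sandboxes sharing a common runtime region dedup to far fewer
# stored chunks than the raw total.
runtime = b"\x01" * (4 * CHUNK)          # shared runtime pages
img_a = runtime + b"\x02" * CHUNK        # sandbox A: runtime + private data
img_b = runtime + b"\x03" * CHUNK        # sandbox B: runtime + private data
store, layouts = dedup([img_a, img_b])
```

Here the ten raw chunks collapse to three stored ones, and any sandbox image can be rebuilt by concatenating the chunks named in its layout; a real system performs this at the memory-management layer rather than over byte strings.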
in cloud computing [39]. Serverless computing is an abstraction of cloud computing infrastructure. It is a cloud computing model in which a cloud provider or a third-party vendor manages the company's servers. The company does not need to purchase, install, host, and manage the servers; instead, the cloud provider delivers all of these services.

Serverless computing, also known as Function-as-a-Service (FaaS), ensures that the code exposed to the developer consists of simple, event-driven functions. As a result, developers can focus more on writing code and delivering innovative solutions, without the hassle of creating test environments and provisioning and managing servers for web-based applications [40]. FaaS and the term "serverless" can be used interchangeably, with serverless computing being FaaS. This is because the FaaS platform automatically configures and maintains the context of the functions and connects them to cloud services without the need for developers to provide a server [41,42].

Features of serverless computing:

• Pay-per-use: In serverless computing, users pay only for the time their code uses dedicated CPU and storage. Pay-per-use pricing models reduce costs, whereas in traditional cloud services users pay for over-provisioned resources like storage and CPU time, which often sit idle [39,43].
• Speed: Teams can move from idea to market more quickly because they can focus entirely on coding, testing, and iterating without the operational costs of server management. There is no need to update fundamental infrastructure like operating systems or other software patches, allowing teams to concentrate on building the best possible features without worrying about the underlying infrastructure and resources.
• Scalability and elasticity: The cloud vendor is responsible for automatically scaling up capabilities and technologies to meet customer demands. Serverless functions should automatically scale down when there are fewer concurrent users. This elasticity turns serverless computing into a pay-as-you-go billing model [39].
• Efficiency and performance: Developers do not need to perform complex tasks like multi-threading, HTTP request handling, etc. FaaS forces developers to focus on building the application rather than configuring it.
• Programming languages: Serverless computing supports many programming languages, including JavaScript, Java, Python, Go, C#, and Swift [44]. This versatility allows developers to choose the language that best suits their project needs and expertise, increasing productivity and enabling rapid application development.

Challenges of serverless computing:

• Vendor lock-in: Vendors typically use proprietary technologies to enable their serverless services. This can create problems for users who want to migrate their workloads to another platform. When migrating to another provider, changes to the code and application architecture are inevitable [45].
• Cold start latency: Serverless services can experience a latency known as "cold start." When the service is first started, it takes some time for the service to respond. The reason for this is the initial configuration by the cloud service provider, resource allocation, and initialization of the infrastructure. This initial delay can be a concern in systems that respond to many requests per second, so methods and techniques for mitigating the cold start problem are important [46–48].
• Debugging complexity: Debugging serverless functions is difficult due to their transient nature, because serverless functions typically do not maintain the state of previous calls (stateless). Their stateless design can also complicate application state management. In addition, reports on serverless function calls should be sent to the developer. These
reports should include detailed stack traces so that developers can identify the cause of an error. Stack tracing is currently not available for serverless computing, meaning developers cannot easily identify the cause of an error [49–51].
• Architectural complexity: A serverless application may consist of multiple functions. The more functions there are, the more complex the architecture becomes, because each function must be properly linked to other functions and services. Managing this large number of functions and services can also be difficult [52].
• Long-running tasks: Serverless platforms execute functions within a limited and short execution time, while some tasks may require a long execution time. Serverless functions do not support long-running execution because they are stateless, meaning that if a function is stopped, it cannot be resumed [53].

3.2. Memory configuration

Memory configuration in serverless computing is an important aspect that influences application performance, resource efficiency, and cost management. Understanding how to optimally allocate memory for serverless functions is important for developers who want to maximize the benefits of serverless computing [40]. In serverless environments, developers deploy small units of code, known as functions, which are executed in response to specific events. Each function operates in a

3.3. Deep reinforcement learning

Neural networks were combined with reinforcement learning for the first time in 1991, when Gerald Tesauro used reinforcement learning to train a neural network to play backgammon at the master level [54]. Deep reinforcement learning (DRL) is a combination of reinforcement learning and deep neural networks. In reinforcement learning, an agent, while interacting with its environment, incrementally learns a policy that enables it to maximize long-term rewards. Combined with deep neural networks, this approach enables the learning of much more complicated policies that are suitable for high-dimensional problems. Fig. 1 shows the elements in a reinforcement learning model.

Deep reinforcement learning is used in various fields such as computer games, robotics, resource management in distributed systems, and performance optimization of cloud computing systems. Deep learning techniques can be used for memory configuration in serverless computing, predicting the amount of memory required to run a serverless application.

Serverless computing is a computing model in which the cloud service provider manages the infrastructure and server resources. A developer just needs to write code and upload it onto the serverless platform. The advantages of serverless computing include automatic scalability, pay-per-resource cost, and ease of management. Deep learning can be used to improve memory configuration in serverless
|
||
stateless manner and is allocated a certain amount of memory at run computing. Deep learning can automatically identify memory usage
|
||
time. The memory configuration affects several factors: patterns in applications and allocate the required memory based on that.
|
||
Benefits of using deep learning for automatic memory configuration in
|
||
• Performance: The memory allocated for a function will determine serverless computing applications:
|
||
execution time. The more the memory, the faster, as it allows the
|
||
CPU to perform better and cold start latency to reduce. But too less • Automation: With deep learning, the technology can easily identify
|
||
memory can make it very slow and cause downtime. patterns concerning memory usage in an application and automate
|
||
• Cost efficiency: The pricing of serverless computing is pay-per-use; the memory needed for allocation. Consequently, this will reduce
|
||
the costs are determined by the amount of resources utilized in time spent on memory configuration.
|
||
execution. Memory configuration can help prevent over-allocation of • Optimization: Deep learning can learn how to optimize memory
|
||
memory, which leads to high costs. However, a lack of memory will usage by managing how memory is allocated. It helps to decrease
|
||
lead to performance issues that need to be balanced [40]. computing costs and enhances the performance of applications.
|
||
• Scalability: Memory configuration is performed in such a way that • Flexibility: Deep learning can learn from changes in memory usage
|
||
serverless applications will be able to scale seamlessly according to patterns. It can help in the improvement of an application over time.
|
||
variable workloads. Since the demand fluctuates, dynamically allo
|
||
cating memory helps achieve good performance without incurring Deep learning to automate the configuration of memory in serverless
|
||
excessive costs. computing faces some challenges. First, there is a need for training data.
|
||
Deep learning requires enough training data in order to identify patterns
|
||
Memory configuration optimization and automated, data-driven in memory usage. Another challenge is the complexity of deep learning,
|
||
approaches to memory management in serverless computing increase which also requires a lot of time to learn. However, using deep learning
|
||
performance and scalability, and help save costs. These methods can for automating the configuration of memory in serverless computing
|
||
allow developers to minimize the challenges of manual memory offers numerous benefits. It will improve memory configuration tech
|
||
configuration. niques and reduce computational costs.
|
||
|
||
|
||
|
||
|
||
Fig. 1. Elements in a reinforcement learning model.
|
||
|
||
6
|
||
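The agent–environment interaction shown in Fig. 1 can be sketched as a simple loop. This is an illustrative sketch, not the paper's implementation: `ServerlessEnv`, `Agent`, and the toy latency/cost dynamics are all assumptions introduced here for illustration.

```python
import random

# Hypothetical discrete memory sizes (MB) an agent could choose from.
MEMORY_SIZES = [128, 256, 512, 1024, 2048]

class ServerlessEnv:
    """Toy stand-in for the serverless environment: returns states and rewards."""
    def reset(self):
        return {"memory_mb": 128, "latency_ms": 250.0}

    def step(self, action_mb):
        # Illustrative dynamics: more memory -> lower latency, higher cost.
        latency = 250.0 * 128 / action_mb
        cost = action_mb * 1e-5
        reward = -(latency / 250.0) - cost          # penalize latency and cost
        next_state = {"memory_mb": action_mb, "latency_ms": latency}
        return next_state, reward

class Agent:
    """Random policy placeholder; a DRL agent would learn this mapping."""
    def act(self, state):
        return random.choice(MEMORY_SIZES)

env, agent = ServerlessEnv(), Agent()
state = env.reset()
for t in range(5):                                  # agent-environment loop
    action = agent.act(state)                       # policy picks a memory size
    state, reward = env.step(action)                # environment returns feedback
```

In a real DRL setup the random `Agent.act` would be replaced by a trained policy network that maps the observed state to a memory size.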
3.4. Autonomic computing

The MAPE-K model provides a framework for the management of autonomous and self-adaptive systems [55–57]. The model comprises five major components: Monitoring, Analysis, Planning, Execution, and Knowledge Management, illustrated in Fig. 2. Monitoring is where data is gathered on an ongoing basis to measure the system's current state and functionality relative to predetermined goals. The subsequent Analysis stage examines this data to bring to light any differences between the present condition and the desired outcomes, providing the information necessary for adjustment. After problem detection, the Planning stage creates schemes to amend these disparities, specifying how the system needs to modify its behavior. The Execution stage then enacts these strategies, modifying the system's behavior to attain the desired results. Knowledge Management serves as a repository for the important information shared among the various stages. Overall, the MAPE-K model offers a feedback loop that allows autonomous systems to adapt to new situations and ensure optimum performance by constantly monitoring and adjusting; it is essential for building resilient and efficient autonomous systems. In addition, autonomous control systems greatly improve the efficiency and reliability of industrial processes by minimizing human error and ensuring consistent performance. Autonomous control systems are closed-loop controls that are preprogrammed to operate without the intervention of an operator, and they ensure predictable results in complex industrial settings. The integration of automated control systems and sophisticated monitoring techniques therefore makes operations simpler and more reliable, and such systems have become a necessity in contemporary manufacturing plants [58].

Fig. 2. Autonomic computing (MAPE-K loop) [57].

4. Proposed approach

In this section, we explain our proposed approach in more detail. First, we introduce the framework that utilizes machine learning for memory configuration in serverless computing, followed by the problem formulation and the algorithm. Finally, we describe the autonomous memory configuration that uses deep learning to predict memory settings for serverless computing.

4.1. Proposed framework

In the proposed solution, Auto Opt Mem introduces an autonomous, learning-based memory configuration approach for serverless computing. The goal of Auto Opt Mem is to address the challenge of efficiently distributing serverless functions (SFs) in a serverless environment while considering a number of real-world parameters. One of the key parameters considered within Auto Opt Mem is the memory configuration, which determines how much of the available memory resources is allocated to a serverless function. Memory configuration has a direct impact on the performance and resource utilization of these functions. Auto Opt Mem utilizes Deep Reinforcement Learning (DRL), a branch of artificial intelligence, to learn an optimal memory configuration policy. DRL enables the system to learn from experience and make decisions based on a reward mechanism [15,16]. In this regard, Auto Opt Mem learns to allocate memory resources to SFs for maximum performance while minimizing waste of resources.

By employing DRL, Auto Opt Mem takes into account the resource constraints of the serverless environment, the resource requirements of functions, energy consumption, latency, and deployment costs. The system learns the optimal memory configuration policy through a training process that involves interacting with the environment and receiving feedback in the form of rewards. This automatic learning approach enables the system to dynamically adapt to changing conditions and optimize memory allocation based on the specific needs of each SF, contributing to efficient resource utilization and improved performance in serverless computing environments. In short, this work proposes a framework called Auto Opt Mem that uses a deep reinforcement learning agent to automatically optimize the memory allocated to each serverless function with respect to computational resources. The primary goal is to dynamically adjust memory configurations so that functions become highly performant, cost-effective, and efficient regarding resource utilization: Auto Opt Mem seeks to minimize costs and reduce latency while maintaining or improving Quality of Service (QoS).

4.1.1. Key components
This section describes the elements of the Auto Opt Mem framework that are essential to automating the memory configuration process.

Environment
The environment is the serverless platform, such as AWS Lambda or Google Cloud Functions, where a set of functions is deployed and executed. It exposes resource usage information, performance metrics, and all other relevant status information.

Agent
The agent makes the decisions about the memory size to be assigned to each function. It possesses a DRL model that learns through interactions with the environment.

State
The state St at time t includes:
• The current memory allocation for each function.
• Performance metrics (e.g., execution time, latency).
• The number of requests received.
• Cost metrics.
• Specific performance requirements.

Action
The action At at time t is the choice of a memory size from a predefined set of configurations for a particular function.

Reward
The reward Rt at time t is the feedback signal that drives learning. It is defined with the aim of:

• Encouraging efficient memory usage.
• Improving performance metrics.
• Maintaining or increasing Quality of Service (QoS).
• Minimizing operational costs.

Policy
The policy is the strategy by which the agent selects actions according to the current state; the DRL model implements the policy. Fig. 3 shows the iterative loop diagram in deep reinforcement learning algorithms.
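The MAPE-K cycle described above can be sketched as a minimal skeleton. This is an illustrative sketch, not the paper's implementation: the `Knowledge` class, the phase functions, and the latency threshold are all assumed names and values.

```python
class Knowledge:
    """Shared repository of information used by all MAPE-K phases."""
    def __init__(self):
        self.metrics_history = []
        self.policy_params = {}

def monitor(knowledge):
    # Gather the current state (stub: fixed metrics for illustration).
    state = {"latency_ms": 120.0, "cost_usd": 0.002}
    knowledge.metrics_history.append(state)
    return state

def analyze(state, knowledge):
    # Detect deviation from a desired goal (assumed 100 ms latency target).
    return state["latency_ms"] > 100.0

def plan(needs_adaptation, knowledge):
    # Decide an adaptation: here, a hypothetical "increase memory" action.
    return {"action": "increase_memory"} if needs_adaptation else None

def execute(change):
    # Apply the planned change to the managed system (stub).
    return change is not None

k = Knowledge()
state = monitor(k)          # Monitor
deviation = analyze(state, k)  # Analyze
change = plan(deviation, k)    # Plan
applied = execute(change)      # Execute
```

Each pass through the four calls is one iteration of the feedback loop; the `Knowledge` object is what lets later iterations reuse what earlier ones observed.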
Fig. 3. Iterative loop diagram in deep reinforcement learning algorithms.

4.2. Problem formulation

This section presents a mathematical model for the memory configuration problem in serverless computing. In the following, we describe the symbols used in more detail.

4.2.1. Notation
• F: set of serverless functions
• Mi: memory allocated to function fi ∈ F
• Li(Mi): latency of function fi with memory Mi
• Ci(Mi): cost of executing function fi with memory Mi
• Ui(Mi): memory utilization of function fi with memory Mi
• Qi(Mi): Quality of Service metric of function fi with memory Mi
• Rt: reward at time t

The full list of mathematical symbols is summarized in Table 4.

Table 4
List of mathematical symbols.

Definition | Notation
A set of serverless functions | F
Memory allocated to function fi ∈ F | Mi
Latency of function fi with memory Mi | Li(Mi)
Cost of executing function fi with memory Mi | Ci(Mi)
Memory utilization of function fi with memory Mi | Ui(Mi)
Quality of Service metric of function fi with memory Mi | Qi(Mi)
Reward at time t | Rt
State space | S
State at time t | St
Action space | A
Action at time t | At
Current memory allocation of function fi | Mi,t
Latency | Li,t
Cost | Ci,t
Quality of Service | Qi,t
Weighting factor for latency | α
Weighting factor for cost | β
Weighting factor for utilization | γ
Weighting factor for QoS | δ
Minimum memory size that can be allocated to a function | Mi,min
Maximum memory size that can be allocated to a function | Mi,max
Minimum acceptable QoS for function fi | Qi^min
Expected value | E
Policy | π
Value function | Vπ
Policy network parameters | θ
Value network parameters | ϕ
Action-value function | Qπ(s,a)

The framework and automated approach for learning-based memory configuration in serverless computing are as follows:

State space (S)
The state St at time t includes:
• The current memory allocation Mi,t for each function fi.
• Performance metrics such as latency Li,t, cost Ci,t, and Quality of Service Qi,t.
• The number of requests received.

Action space (A)
The action At at time t involves selecting the memory size Mi,t+1 for each function fi.
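The state and action spaces above can be made concrete with a small sketch; the field names and the candidate memory sizes are illustrative assumptions (the 128–2048 MB range matches the evaluation setup later in the paper).

```python
from dataclasses import dataclass

# Discrete action space A: candidate memory sizes in MB (illustrative).
ACTIONS_MB = [128, 256, 512, 1024, 2048]

@dataclass
class FunctionState:
    """State S_t for one serverless function f_i (illustrative fields)."""
    memory_mb: int        # current allocation M_{i,t}
    latency_ms: float     # L_{i,t}
    cost_usd: float       # C_{i,t}
    qos: float            # Q_{i,t}
    requests: int         # number of requests received

def to_vector(s: FunctionState) -> list:
    """Flatten the state into the numeric vector fed to a policy network."""
    return [s.memory_mb, s.latency_ms, s.cost_usd, s.qos, s.requests]

state = FunctionState(memory_mb=256, latency_ms=120.5, cost_usd=0.003,
                      qos=0.97, requests=42)
vec = to_vector(state)        # 5-dimensional state vector for one function
```

Selecting the action At then amounts to picking one element of `ACTIONS_MB` for each function, which is what the policy network learns to do.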
Reward function (R)
The reward function is an essential component of reinforcement learning, guiding the agent toward desired behaviors by associating rewards or penalties with the outcomes of actions. The reward function Rt is designed to:

• Reduce latency and cost.
• Decrease memory over-allocation and increase QoS.

This is expressed by Eq. (1):

Rt = − ∑i∈F ( αLi(Mi,t) + βCi(Mi,t) − γUi(Mi,t) − δQi(Mi,t) )   (1)

where α, β, γ, and δ are the weighting factors for latency, cost, utilization, and QoS, respectively. These weights adjust the relative importance of each component in the reward function; for example, if reducing latency is more important than cost, α can be increased relative to β. Lower latency is desirable; therefore, −αLi(Mi,t) penalizes higher latencies. Lower costs are also preferable; thus, −βCi(Mi,t) penalizes higher expenses. Memory should be utilized efficiently: over-allocation of memory leads to resource waste, while under-allocation can degrade performance, and the term +γUi(Mi,t) rewards the agent for efficient memory usage. Higher QoS is preferred; therefore, +δQi(Mi,t) rewards better service quality.

To normalize the reward function for optimizing memory configuration in serverless computing, all components (cost, latency, utilization, and QoS) must be brought to a common scale, because they have different units (for example, cost in dollars and latency in milliseconds). We use min-max normalization for each component, defined by Eq. (2):

x′ = (x − xmin) / (xmax − xmin)   (2)

where x is the original value, xmin is the minimum value of that metric, and xmax is its maximum value.

After normalization, the metrics can be combined into the reward function without unit conflicts, and the final reward formula, expressed by Eq. (3), becomes:

Rt = − ∑i∈F ( α·(Li(Mi,t) − Lmin)/(Lmax − Lmin) + β·(Ci(Mi,t) − Cmin)/(Cmax − Cmin) − γ·(Ui(Mi,t) − Umin)/(Umax − Umin) − δ·(Qi(Mi,t) − Qmin)/(Qmax − Qmin) )   (3)

Normalization maps each component into the range [0, 1], making the components comparable even though they come from different units, so they can be combined meaningfully during optimization. This increases the robustness of the model and improves convergence in reinforcement learning algorithms.

4.2.2. Optimization problem
First, the problem must be formulated mathematically in terms of optimization and reinforcement learning for the Auto Opt Mem algorithm and its solution. The optimization problem is expressed by Eq. (4):

maxM ∑t=0..T Rt   (4)

subject to the memory constraints expressed by Eq. (5):

Mi,min ≤ Mi,t ≤ Mi,max for all i ∈ F   (5)

and the Quality of Service constraints expressed by Eq. (6):

Qi(Mi,t) ≥ Qi^min for all i ∈ F   (6)

Mi,min and Mi,max are the minimum and maximum memory sizes that can be allocated to function fi. Qi^min is the minimum acceptable QoS for function fi; this constraint ensures that the QoS does not fall below a certain threshold.

4.2.3. Reinforcement learning formulas
This section explains the mathematical formulas used in the reinforcement learning component of the Auto Opt Mem framework.

Policy (π)
The policy π(a|s) is the probability distribution over actions given the current state. It is the strategy the decision-maker follows to choose its next action in response to the current state.

Value function (Vπ)
The value function refers to the expected long-term discounted reward, in contrast to the short-term reward Rt. It is an important concept in reinforcement learning and represents the expected return (cumulative reward) obtained by starting from state s and following policy π. It is defined by Eq. (7):

Vπ(s) = E[ ∑t=0..∞ γ^t Rt | S0 = s, π ]   (7)

where γ is the discount factor.

• Expected value (E): due to the stochastic nature of the environment, this is the average or expected return over all possible future states.
• Sum of rewards: ∑t=0..∞ γ^t Rt represents the accumulated rewards over time. Rewards are discounted by the factor γ^t to prioritize immediate rewards over distant future rewards.
• Discount factor (γ): the discount factor γ (where 0 < γ < 1) determines the present value of future rewards. A higher γ makes future rewards more significant, while a lower γ emphasizes immediate rewards.
• Policy (π): the policy π is the decision rule that gives the probability of executing action a from state s.

The value function is important because it measures how good a given state is under a given policy. It is also the basis for comparing policies and for determining optimal actions that maximize the long-term reward.

Bellman equation
The Bellman equation provides a recursive decomposition of the value function, making it a powerful tool for solving reinforcement learning problems. The Bellman equation for the value function is defined by Eq. (8):

Vπ(s) = Ea∼π[ Rt + γVπ(St+1) | St = s, At = a ]   (8)

Reasons for using the Bellman equation:

• Recursive nature: the Bellman equation breaks down the value of a state into the immediate reward Rt plus the discounted value of the next state, γVπ(St+1). This recursion makes computation efficient and forms the foundation of dynamic programming techniques.
• Policy evaluation: by repeatedly updating the value function using the recursive relationship, it aids in evaluating the expected return of a policy π.
• Optimality principle: for finding the optimal policy, the Bellman optimality equation expresses the relationship between the value of a state and the values of subsequent states. This is used in algorithms such as value iteration and Q-learning to find the optimal value function.
• Simplification of complexity: the recursive approach allows the Bellman equation to simplify the calculation of the value function for all states, which would otherwise be computationally infeasible in large state spaces.
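A minimal sketch of the normalized reward (Eqs. (2) and (3)) and the feasibility constraints (Eqs. (5) and (6)); the weights and metric bounds below are illustrative assumptions, not values from the paper.

```python
def minmax(x, x_min, x_max):
    """Min-max normalization (Eq. 2), mapping x into [0, 1]."""
    return (x - x_min) / (x_max - x_min) if x_max > x_min else 0.0

def reward(functions, w, bounds):
    """Normalized reward (Eq. 3): penalize latency/cost, reward utilization/QoS."""
    a, b, g, d = w["alpha"], w["beta"], w["gamma"], w["delta"]
    r = 0.0
    for f in functions:
        r -= (a * minmax(f["latency"], *bounds["latency"])
              + b * minmax(f["cost"], *bounds["cost"])
              - g * minmax(f["util"], *bounds["util"])
              - d * minmax(f["qos"], *bounds["qos"]))
    return r

def feasible(mem_mb, m_min, m_max, qos, qos_min):
    """Memory bound (Eq. 5) and minimum-QoS (Eq. 6) constraints."""
    return m_min <= mem_mb <= m_max and qos >= qos_min

weights = {"alpha": 0.4, "beta": 0.3, "gamma": 0.2, "delta": 0.1}   # assumed
bounds = {"latency": (50.0, 500.0), "cost": (0.0, 0.01),
          "util": (0.0, 1.0), "qos": (0.0, 1.0)}                    # assumed
fns = [{"latency": 120.0, "cost": 0.002, "util": 0.8, "qos": 0.95}]
r = reward(fns, weights, bounds)
ok = feasible(512, 128, 2048, 0.95, 0.9)
```

Because every component is scaled into [0, 1] before weighting, a change in one metric's units (say, milliseconds to seconds) does not silently change its influence on the reward.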
Policy gradient
The policy gradient method is used to optimize the policy π and is defined by Eq. (9):

∇θ J(πθ) = Es∼dπ, a∼πθ[ ∇θ log πθ(a|s) Qπ(s, a) ]   (9)

where θ are the parameters of the policy and Qπ(s,a) is the action-value function. The action-value function represents the expected return of taking action a in state s and then following policy π, providing a basis for policy improvement. The Bellman equation and the value function are thus the basis for studying and quantifying the long-term impact of allocation decisions and for guiding the policy toward optimal behavior.

4.3. Proposed algorithm

The proposed Auto Opt Mem algorithm uses deep reinforcement learning to learn how to assign serverless functions to compute resources efficiently. Incorporating the MAPE loop, the Auto Opt Mem algorithm continuously monitors, analyzes, plans, and executes actions to optimize memory allocation in an autonomous and adaptive manner. The process of the algorithm is as follows:

4.3.1. Initialization
This is the first step in the deep reinforcement learning (DRL) process for memory configuration. First, two main networks are prepared. The Policy Network is responsible for choosing the action (e.g., the amount of memory to allocate) in each state; it starts with random parameters, because it has no knowledge at the beginning, and gradually learns to make optimal decisions. The Value Network estimates the long-term value of a state based on the sum of future rewards; it also starts with random parameters. Then, the initial state of the environment (S0) is defined, which includes the memory configuration of each function, performance indicators (latency, cost, QoS, and resource utilization), and the rate of incoming requests. This step is the basis of training and provides the foundation for the agent's interaction with the environment. This initialization is very important, as it sets the starting point for training the networks to learn optimal memory allocation for serverless functions, with the aim of minimizing cost and latency while maintaining high QoS. This is shown in Algorithm 1.

Algorithm 1
Pseudo code for the initialization phase.
1: Input: Set of serverless functions F
2: Output: Initialized policy network πθ, value network Vϕ, and initial state S0
3: Initialize policy network πθ with random parameters θ
4: Initialize value network Vϕ with random parameters ϕ
5: Define initial system state S0 including:
6:   - Memory allocation for each fi ∈ F
7:   - Performance metrics: Latency Li(Mi), Cost Ci(Mi), Utilization Ui(Mi), QoS Qi(Mi)
8:   - Incoming request rate
9: Return (πθ, Vϕ, S0)

4.3.2. MAPE loop
In this section, we describe the autonomous memory configuration, which includes four phases: monitoring, analysis, planning, and execution.

4.3.2.1. Monitor. In the monitoring phase, the system continuously observes the current state of the environment, so that it is always aware of the environment's state. This includes monitoring the memory allocated to each serverless function, obtaining performance data such as latency, cost, utilization, and Quality of Service (QoS), and monitoring the number of incoming requests for each function. The monitoring phase is important because it records the real-time information that indicates system performance and resource utilization. In short:

• Continuously observe the current state St, including memory allocation, performance metrics (latency Li,t, cost Ci,t, utilization Ui,t, QoS Qi,t), and the incoming request rate.
• Collect data from the serverless functions and the environment.

4.3.2.2. Analyze. In the analysis phase, the system uses the performance metrics collected during the monitoring phase. It examines the current performance of the serverless functions based on the collected data and calculates a reward value that drives learning. The reward function is based on latency, cost, utilization, and QoS, exposing any inefficiencies or issues that need to be addressed in subsequent phases. In short:

• Evaluate the performance metrics Li(Mi), Ci(Mi), Ui(Mi), Qi(Mi).
• Calculate the reward Rt for the current state, as expressed by Eq. (10):

Rt = − ∑i∈F ( α·(Li(Mi,t) − Lmin)/(Lmax − Lmin) + β·(Ci(Mi,t) − Cmin)/(Cmax − Cmin) − γ·(Ui(Mi,t) − Umin)/(Umax − Umin) − δ·(Qi(Mi,t) − Qmin)/(Qmax − Qmin) )   (10)

4.3.2.3. Plan. In the planning phase, the system decides on the next actions based on the analysis observations. The best memory allocation decisions are selected based on the policies learned by the deep reinforcement learning model, and the policy network parameters are updated to enhance future decision-making. This planning helps the system adjust its memory allocation policies effectively based on the current situation and the performance analysis. In short:

• Use the policy network πθ to determine the optimal action At (the memory allocation for the next time step).
• Update the policy and value networks using reinforcement learning techniques. This typically involves:
  ○ calculating the policy gradient using the policy gradient method;
  ○ updating the policy network parameters θ;
  ○ using the Bellman equation to update the value network parameters ϕ.

4.3.2.4. Execution. The execution phase is responsible for carrying out the planned actions. It allocates memory to every function according to the decisions of the planning phase and transitions to the next state, ready to trigger the MAPE cycle once more. Its role is to realize the changes from planning and analysis, optimize resource utilization, and improve the system as a whole. In summary:

• Apply the selected action to adjust the memory allocation for each function.
• Move to the next state St+1 and repeat the loop.

The MAPE loop enables dynamic and automated memory management in serverless computing systems, where the system can continuously learn and adapt to new situations, eventually resulting in enhanced performance and resource efficiency, as shown in Algorithm 2.

4.3.3. Training loop
The training loop is an important part of the Auto Opt Mem algorithm, which uses deep reinforcement learning to improve how memory is allocated for serverless functions.
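The MAPE-driven training loop of Sections 4.3.2 and 4.3.3 can be sketched end to end. This is a toy stand-in under stated assumptions: the discrete action set is assumed, and the paper's PPO policy/value networks are replaced here by a simple epsilon-greedy value table so the loop stays self-contained; all dynamics and constants are illustrative.

```python
import random

ACTIONS_MB = [128, 256, 512, 1024, 2048]        # assumed discrete action set

def monitor(env_state):
    """Monitor: observe the current state S_t (stub returns a copy)."""
    return dict(env_state)

def analyze(state):
    """Analyze: compute a reward R_t (illustrative: penalize latency and cost)."""
    return -(state["latency_ms"] / 1000.0 + state["cost_usd"])

def plan(state, values, eps=0.1):
    """Plan: epsilon-greedy choice standing in for the learned policy network."""
    if random.random() < eps:
        return random.choice(ACTIONS_MB)
    return max(ACTIONS_MB, key=lambda a: values.get(a, 0.0))

def execute(action_mb):
    """Execute: apply the allocation and produce the next state S_{t+1}."""
    latency = 400.0 * 128 / action_mb            # toy dynamics
    cost = action_mb * 1e-5
    return {"memory_mb": action_mb, "latency_ms": latency, "cost_usd": cost}

values = {}
state = {"memory_mb": 128, "latency_ms": 400.0, "cost_usd": 0.001}
for t in range(20):                              # training iterations (toy)
    s = monitor(state)                           # Monitor
    a = plan(s, values)                          # Plan (Analyze feeds the update)
    state = execute(a)                           # Execute -> S_{t+1}
    r = analyze(state)                           # Analyze: reward for feedback
    # crude value update in place of PPO's policy/value optimization
    values[a] = values.get(a, 0.0) + 0.5 * (r - values.get(a, 0.0))
```

In the actual framework, the `plan` step would query the policy network and the value update would be a PPO gradient step; the control flow of the loop is the same.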
Z. Shojaee Rad et al. Computer Standards & Interfaces 97 (2026) 104098
|
||
|
||
|
||
is allocated for serverless functions. The process starts by resetting the Although AWS Lambda and CloudWatch were used conceptually to
|
||
environment to its initial state, called S0. For each time step t, the al structure the resource and metric model, all experiments were executed
|
||
gorithm goes through the MAPE loop. It begins by checking the current in a simulated environment. As shown in Table 5.
|
||
state St, gathering data about memory usage, performance metrics, and The experiments involved tuning a set of hyperparameters. The pa
|
||
other relevant information. After collecting this data, it analyzes it to see rameters were:
|
||
how well the system is performing and calculates a reward that helps Learning rate: Different learning rates like 0.01, 0.001 and 0.0001
|
||
guide future learning. Next, the algorithm plans the next action by were attempted during training.
|
||
choosing the best way to allocate memory using its policy network. This Discount Factor (γ): Discounting coefficient of 0.99 was chosen to
|
||
decision is based on the insights gained from the analysis. Once it de give higher importance to long-term rewards for the reinforcement
|
||
cides on an action, the system carries it out, adjusting the memory for learning.
|
||
each function as needed. Finally, the algorithm moves to the next state Batch size: A batch size of 32 was utilized in training the deep
|
||
St+1 and prepares to start the loop again. This ongoing process allows the learning models, a trade-off between training time and model accuracy.
|
||
system to learn and adapt continuously, refining its memory allocation Number of episodes: Training was carried out over 100 episodes,
|
||
strategies based on real-time feedback. Each cycle helps improve the where the agent learns through experience interacting with the envi
|
||
overall performance and efficiency of memory management in the ronment and tunes its memory allocation policies.
|
||
serverless environment. In summary, it can be said: Both the policy and value networks are implemented as multilayer
|
||
perceptrons (MLPs). The input layer receives the state vector, which
|
||
1. Environment reset: Reset the environment to its initial state S0. includes memory allocation, latency, cost, utilization, QoS, and request
|
||
2. MAPE execution: For each time step t: rate. Each network has two fully connected hidden layers with 128 and
|
||
• Monitor: Observe the current state St. 64 neurons, respectively, and ReLU activation functions. The policy
|
||
• Analyze: Evaluate performance and calculate reward. network ends with a softmax output layer producing a probability dis
|
||
• Plan: Select an action using the policy network. tribution over possible memory allocations, while the value network has
|
||
• Execute: Execute the action and move to the next state St+1. a single linear output estimating the state value. The generated experi
|
||
ence data was split into 70 % for training and 30 % for evaluation, which
|
||
Algorithm 3 shows the execution phase of the MAPE loop during the is standard in DRL-based optimization studies.
|
||
training process.
5. Performance evaluation

In this section, we present the performance evaluation of the proposed automatic deep learning-based approach (Auto Opt Mem) for memory configuration in serverless computing. We describe the experimental setup, the performance metrics, and the results of the experiments.

5.1. Experimental setup

We carried out the experimental analysis on a 64-bit Windows 11 computer with an Intel Core i7 processor. The evaluation uses a serverless simulation environment in which memory sizes between 128 MB and 2048 MB are modeled to study their impact on latency, cost, and quality of service (QoS). A virtual CPU (vCPU) model with burst behavior and dynamic workload scaling is incorporated into the simulation. The deep reinforcement learning algorithms were implemented in Python. The Proximal Policy Optimization (PPO) algorithm was used to train the deep reinforcement learning agent because it is known to be stable and efficient in policy optimization. PPO is well suited to continuous control and policy optimization problems in environments with large, dynamic state spaces, such as serverless platforms, and it trains more stably than algorithms such as REINFORCE or vanilla policy gradient. It can control the trade-off between exploration and exploitation in dynamic environments where the workload changes randomly, and it scales to high-dimensional state spaces in which multiple parameters, such as latency, cost, utilization, and QoS, must be optimized simultaneously. The workloads used in our evaluation consisted of four modeled serverless scenarios: ML inference, API aggregation, data preprocessing (ETL), and video processing. These workloads are designed to emulate the performance behavior of serverless applications while being executed in a fully simulated environment. Furthermore, all experiments are implemented in Python, and the source code of the simulation can be downloaded from the GitHub repository.1

1 https://github.com/zahrashj-rad/Auto-Opt-Mem

5.2. Performance metrics

To evaluate the effectiveness of the proposed approach, we utilize several performance metrics:

Latency: the time taken for a function to execute, measured in milliseconds; lower latency indicates better performance. It is expressed by Eq. (11):

L = (Tend − Tstart) / N (11)

where Tstart and Tend are the start and end times of execution, and N is the number of function invocations.

Cost: the total cost incurred during function execution, measured in US dollars; our goal is to minimize this cost. It is expressed by Eq. (12):

C = Σ_{i=1}^{N} (Mi × Pmem + Ti × Pexec) (12)

where Mi is the allocated memory, Pmem is the price per MB, Ti is the execution time, and Pexec is the execution price per second.

Quality of Service (QoS): a composite score reflecting the reliability and user satisfaction of the service, based on latency and availability. It is expressed by Eq. (13):

QoS = 1 / (L + δ(1 − A)) (13)

where A is the availability factor (the percentage of successful executions), which lets QoS take the impact of availability on overall system performance into account, and δ is a weighting parameter.

Utilization: the efficiency of memory usage during function execution, aiming for optimal allocation without wastage. It is expressed by Eq. (14):

U = (Mused / Mallocated) × 100 (14)

where Mused is the actual memory usage and Mallocated is the assigned memory. A higher value indicates better management of memory resources.
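The four metrics of Eqs. (11)–(14) can be computed directly; the sketch below uses illustrative prices and measurements (all numbers are assumptions for the example, not values from the experiments).

```python
def latency(t_start, t_end, n):
    """Eq. (11): average execution latency over n invocations (ms)."""
    return (t_end - t_start) / n

def cost(mem_mb, exec_s, p_mem, p_exec):
    """Eq. (12): total cost of a batch of invocations (USD).
    mem_mb and exec_s are per-invocation memory sizes and durations."""
    return sum(m * p_mem + t * p_exec for m, t in zip(mem_mb, exec_s))

def qos(lat, availability, delta=100.0):
    """Eq. (13): composite QoS from latency and availability;
    delta weights the availability penalty."""
    return 1.0 / (lat + delta * (1.0 - availability))

def utilization(mem_used, mem_allocated):
    """Eq. (14): percentage of the allocated memory actually used."""
    return mem_used / mem_allocated * 100.0

# Illustrative invocation batch (hypothetical numbers)
L = latency(t_start=0.0, t_end=450.0, n=5)                  # 90.0 ms
C = cost([128, 256], [0.2, 0.3], p_mem=0.0001, p_exec=0.05)  # 0.0634 USD
Q = qos(lat=90.0, availability=0.99)
U = utilization(mem_used=96, mem_allocated=128)              # 75.0 %
```

Higher QoS and utilization and lower latency and cost are the four objectives the reward function later combines.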
Z. Shojaee Rad et al. Computer Standards & Interfaces 97 (2026) 104098
Algorithm 2
Pseudo code for the MAPE loop phase.

1: Input: Current state St, networks (πθ, Vϕ)
2: Output: Updated state St+1
3: Monitor:
4:   Collect metrics (Latency Li,t, Cost Ci,t, Utilization Ui,t, QoS Qi,t, Requests)
5: Analyze:
6:   Compute reward Rt using the reward function
7:   Rt = −(αLi(Mi,t) + βCi(Mi,t) − γUi(Mi,t)) + δQi(Mi,t)
8:   Normalize all components (cost, latency, utilization, QoS)
9:   using:
10:    X' = (X − Xmin) / (Xmax − Xmin)
11: Plan:
12:   Select action At = πθ(St)
13:   Update policy network πθ using the policy gradient
14:    ∇θ J(πθ) = E_{s∼dπ, a∼πθ}[∇θ log πθ(a|s) Qπθ(s, a)]
15:   Update value network Vϕ using the Bellman equation
16:    Vπ(s) = E_{a∼π}[Rt + γVπ(St+1) | St = s, At = a]
17: Execute:
18:   Apply action At (update memory allocation Mi,t+1)
19:   Transition to next state St+1
20: Return St+1

Algorithm 3
Pseudo code for the MAPE execution phase.

1: Input: Environment, networks (πθ, Vϕ)
2: Output: Optimized memory allocation policy
3: Initialize environment and obtain initial state S0
4: For each episode do
5:   For each time step t do
6:     Run the MAPE loop with input St
7:     Observe reward Rt and next state St+1
8:     Store tuple (St, At, Rt, St+1)
9:     Update πθ and Vϕ based on (St, At, Rt, St+1)
10:      Policy network update: ∇θ J(πθ) = E_{s∼dπ, a∼πθ}[∇θ log πθ(a|s) Qπθ(s, a)]
11:      Value network update: Vπ(s) = E_{a∼π}[Rt + γVπ(St+1) | St = s, At = a]
12:    Set St ← St+1
13:  End For
14: End For
15: Return optimized policy πθ

Table 5
Tools and technologies.

Component | Description
Cloud Provider | AWS Lambda (conceptual model), implemented as a simulated environment in code.
Programming Language | Python, for implementing the deep reinforcement learning algorithms.
Deep Learning Framework | TensorFlow/Keras, for building and training the deep learning models.
Reinforcement Learning Algorithm | Proximal Policy Optimization (PPO)
Monitoring Tools | Simulated monitoring module (MAPE-K); no real CloudWatch data used.
Data Management | Pandas, for data manipulation and analysis of the performance metrics.
Development Environment | Jupyter Notebook / Google Colab for interactive development and experimentation.

5.3. Experimental results

We evaluated our proposed approach against baseline methods, including machine learning-based approaches, and studied the impact of the learning rate and the reward function.

5.3.1. First scenario: impact of learning rate on optimization

One of the important hyperparameters in deep reinforcement learning is the learning rate (LR), which controls the extent of updates to the model's weights. We tested different learning rates to examine their impact on Auto Opt Mem's efficiency.

In this experiment, different learning rates were applied to the problem of prediction and memory regulation in serverless environments, using a reinforcement learning agent based on policy gradient. The goal was to show the impact of the learning rate on memory allocation policy learning and, ultimately, on the performance metrics latency, cost, QoS, and utilization. Three learning rate values were tested (0.01, 0.001, and 0.0001), and each experiment was run for 100 episodes. The environment model and the relationships between memory and the metrics were implemented as a noisy function to represent the overall system behavior and the impact of memory selection on the metrics.

The learning rate can affect the performance metrics as follows:

Latency: A very high learning rate may destabilize training, preventing the model from converging to an optimal policy. This instability leads to fluctuations in memory allocation and, consequently, higher execution latency. Conversely, a moderate learning rate supports stable convergence, resulting in lower latency.

Cost: If memory allocation decisions are unstable due to an excessively high learning rate, resource usage can increase, raising the execution cost. However, in some cases a higher learning rate can accelerate convergence to an efficient allocation, thus reducing costs. The impact on cost is therefore twofold, depending on whether training stabilizes.

Quality of Service (QoS): High learning rates may cause instability in memory allocation policies, leading to inconsistent performance and degraded QoS. In contrast, a well-tuned moderate learning rate achieves more stable optimization and improved QoS.

Utilization: Rapid but unstable adjustments caused by a high learning rate can lead to inefficient memory allocations, resulting in either under-utilization or over-utilization of resources. A balanced learning rate is more likely to achieve efficient utilization. Fig. 4 shows the learning rates and results.

The cost and quality-of-service results obtained from training the agent with different learning rates showed the following:

The intermediate learning rate LR = 0.001 shows the lowest latency and the highest QoS among the three values, while its cost increases significantly and its utilization decreases. The agent with LR = 0.001 learns faster, and more decisively, policies that favor higher memory (or allocations that reduce latency); this improves QoS but increases the cost per execution unit. The intermediate rate allows the agent to make weight changes strong enough to reach regions with lower latency, although these may be cost-ineffective.

The larger learning rate LR = 0.01 shows very low cost and very high utilization, but latency and QoS remain at moderate levels. High learning rates usually make large updates; in this simulation, the agent arrived at a policy that keeps the cost low (e.g., choosing low or average memory sizes) while maintaining high utilization. This could mean learning a cost-saving policy; however, such a policy may cause fluctuations and fail to reach the optimal latency. It is also possible for the agent to get stuck in a local optimum (with low cost) under noisy gradients.

The smaller learning rate LR = 0.0001 produced intermediate results (latency and QoS close to LR = 0.01 but average cost). A rate that is too small leads to slow, stable learning; the agent may not have fully converged and thus may not show significant improvement by the end of training. Table 6 shows the impact of the learning rate on the performance metrics.

5.3.2. Second scenario: reward function formula based on the MAPE loop

To further analyze the impact of Auto Opt Mem, we conducted eight experiments and calculated the reward function. As explained in Section 4, the reward function is calculated from the following formula.
Rt = − Σ_{i∈f} [ α (Li(Mi,t) − Lmin)/(Lmax − Lmin) + β (Ci(Mi,t) − Cmin)/(Cmax − Cmin) − γ (Ui(Mi,t) − Umin)/(Umax − Umin) ] + δ (Qi(Mi,t) − Qmin)/(Qmax − Qmin)

where α, β, γ, and δ are the weights for each variable. The minus sign at the beginning of the formula indicates that the negative impact of cost and latency is included in the reward, while quality of service (Qi) is considered positive and contributes positively to the reward. By summing over the functions, the influence of all functions is taken into account. To calculate the reward function, the following minimum and maximum values are used:

Lmin = 80, Lmax = 100
Cmin = 0.15, Cmax = 0.50
Umin = 60, Umax = 80
Qmin = 40, Qmax = 70

The reward function values for the different experimental data are shown in Table 7 (latency in ms, cost in USD, QoS in %, utilization in %, and the reward as a normalized score).

We calculate the reward function for each experiment.

Experiment 1:

R1 = −( (90 − 80)/(100 − 80) + (0.17 − 0.15)/(0.50 − 0.15) − (75 − 60)/(80 − 60) ) + (66 − 40)/(70 − 40)

R1 = −(0.5 + 0.1333 − 0.75) + 0.8667 = −(−0.1167) = 0.1167

The reward function for the remaining experiments is calculated similarly.

These findings confirm the ability of Auto Opt Mem to optimize execution time while reducing costs. The reward function is analyzed as follows:

• Increasing QoS and utilization increases the reward, because the system operates more efficiently.
• Reducing latency and cost has a positive effect on the reward, which indicates more optimal performance.
• The highest reward value is observed in Experiment 8 (0.91), which indicates the optimal balance between QoS, utilization, latency, and cost.
• The lowest reward value is recorded in Experiment 6 (0.03), which is due to the increase in latency and the decrease in system efficiency (utilization).

Fig. 4. Impact of learning rate on performance metrics.
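The normalization terms of the reward can be checked numerically. The sketch below evaluates them on the Experiment 1 measurements; the unit weights α = β = γ = δ = 1 are an illustrative assumption (the weight values used in the experiments are not stated in this section), so the absolute reward value will differ from the tabulated one.

```python
# Normalized multi-objective reward, following the formula above.
# The unit weights are an assumption for illustration only.

BOUNDS = {  # (min, max) normalization ranges from the experiments
    "latency": (80, 100),
    "cost": (0.15, 0.50),
    "utilization": (60, 80),
    "qos": (40, 70),
}

def norm(value, key):
    """Min-max normalization X' = (X - Xmin) / (Xmax - Xmin)."""
    lo, hi = BOUNDS[key]
    return (value - lo) / (hi - lo)

def reward(latency, cost, utilization, qos,
           alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    # Latency and cost are penalties, utilization offsets them,
    # and QoS contributes positively.
    penalty = (alpha * norm(latency, "latency")
               + beta * norm(cost, "cost")
               - gamma * norm(utilization, "utilization"))
    return -penalty + delta * norm(qos, "qos")

# Experiment 1 measurements: 90 ms, $0.17, 75 % utilization, 66 % QoS
terms = (norm(90, "latency"), norm(0.17, "cost"),
         norm(75, "utilization"), norm(66, "qos"))
print(terms)    # (0.5, ~0.0571, 0.75, ~0.8667)
print(reward(90, 0.17, 75, 66))
```

The four normalized terms make the metrics comparable despite their different units, which is what allows a single scalar reward to drive the multi-objective optimization.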
Table 6
Comparison of learning rate effects on performance metrics.

Learning Rate (LR) | Latency | Cost | QoS | Utilization
0.01 (high) | Relatively high, unstable, converges to a moderate level | Lowest cost | Moderate QoS | Highest utilization
0.001 (medium) | Lowest latency (best) | Highest cost | Highest QoS | Reduced utilization
0.0001 (low) | Moderate, slow convergence | Moderate cost | Moderate QoS | Moderate utilization
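The qualitative pattern in Table 6 can be reproduced on a toy problem: gradient descent on a one-dimensional quadratic loss, standing in for the policy objective. The loss shape, target, and step counts are illustrative assumptions, not the paper's training setup.

```python
# Effect of the learning rate on a toy objective f(m) = (m - 512)**2,
# where 512 is a hypothetical optimal memory size. With gradient descent,
# a too-large step oscillates and diverges, a moderate step converges
# quickly, and a tiny step converges very slowly.

def descend(lr, steps=50, start=2048.0, target=512.0):
    m = start
    for _ in range(steps):
        grad = 2.0 * (m - target)    # derivative of (m - target)**2
        m -= lr * grad
    return m

for lr in (1.1, 0.1, 0.001):         # "high", "medium", "low" rates
    print(lr, descend(lr))
```

After 50 steps the moderate rate lands essentially on the target, the tiny rate is still far away, and the large rate has diverged, mirroring the instability-versus-slow-convergence trade-off discussed above.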
Fig. 5 shows the reward function graph (x-axis: experiment ID, 1–8; y-axis: normalized reward value). Fig. 6 shows the comparison of the reward function with the other metrics in the different experiments.

• Comparison of latency and reward: Fig. 6a shows that decreasing latency generally increases the reward. The reward value is higher at lower latencies (such as 84 and 85 ms) but decreases at higher latencies.
• Comparison of cost and reward: Fig. 6b shows that as costs decrease, the rewards increase, indicating a favorable outcome for the experiments.
• Comparison of quality of service and reward: Fig. 6c shows that increasing QoS usually increases the reward. Quality of service in the range of 70–72 % has the highest reward value.
• Comparison of utilization and reward: Fig. 6d shows that higher utilization usually leads to increased reward. The reward value increases sharply for values of 80 % and above.

These graphs show that the optimized system tends to reduce latency, reduce cost, increase quality of service, and improve efficiency or utilization in order to obtain the maximum reward. The proposed reward function, which incorporates latency, cost, utilization, and QoS, can effectively balance the system's various goals. Experiment 8, with the highest reward of 0.91, depicts the optimal balance between all the objectives. Experiment 6, with the lowest reward of 0.03, is worse due to its high latency and low utilization. The comparison graphs in Fig. 6 clearly show that an improvement in any of the criteria, such as reducing latency and cost or increasing QoS and utilization, leads to an increase in reward. This means that the reward function design is suitable and can be used as a criterion for multi-objective optimization.

5.3.3. Third scenario: comparison with previous studies

To validate our results, we compared Auto Opt Mem with two papers, [17] and [18], that tackled memory optimization in serverless computing. Compared to the works of Eismann et al. [17] and Jindal et al. [18], Auto Opt Mem achieves a more substantial reduction in execution latency, a greater cost reduction, and a significant improvement in QoS. Unlike previous studies that primarily relied on static function profiling or statistical estimations, Auto Opt Mem continuously learns and adapts to varying workloads, and is thus more adaptive and scalable. Eismann et al. [17] developed a model for resource prediction based on monitoring a single memory size, achieving performance gains but with limited adaptability to dynamic workloads. Jindal et al. [18] introduced a statistical and deep learning approach to estimate function capacity, improving resource efficiency but lacking real-time optimization capabilities. Auto Opt Mem integrates reinforcement learning to dynamically adjust memory allocations, ensuring optimal performance across diverse execution environments without requiring manual intervention.

This section assesses the proposed Auto Opt Mem in a realistic serverless simulation environment. Four representative workloads were modeled, including ML inference, API gateway, data processing, and video processing, each independently defined by its memory range, baseline latency, and cost functions. A neural network with two hidden layers (32 and 16 neurons) was used to train the PPO agent for 80 episodes. The reward function incorporates normalized latency, cost, and QoS terms that drive the policy toward a balanced optimization. An autonomic MAPE-K (Monitor, Analyze, Plan, Execute – Knowledge) loop was implemented at runtime to make the system self-adaptive: it continuously monitors recent performance, analyzes QoS/cost trends, plans corrective actions such as tuning the memory, and executes them by adjusting the PPO policy parameters. The results are summarized in Table 8.

Auto Opt Mem demonstrates superior improvements across all metrics compared to existing research, establishing its effectiveness. Fig. 7 shows the comparison with previous studies.

Table 8 reports the average performance achieved by Auto Opt Mem across all benchmark functions in terms of latency reduction, cost savings, and QoS improvement. These values are directly compared with the results of Sizeless [17] and FnCapacitor [18]. The results demonstrate that, on average, the proposed Auto Opt Mem achieves 25–30 % less latency, 15–18 % less cost, and 10–12 % more QoS than the Sizeless [17] baseline, while outperforming FnCapacitor [18] in all key metrics. This shows that the PPO-based MAPE-K autonomic loop can dynamically adjust to workload variability and provide optimized resource allocation in serverless environments.

The results of the eight experiments conducted with the different metrics are presented in Table 9. Synthetic yet reproducible data were used, and the dataset is openly provided for replication.

Table 10 summarizes the statistical comparison of Auto Opt Mem with the baseline methods. For each metric, the mean, standard deviation, and 95 % confidence interval were calculated. For instance, Auto Opt Mem has a latency of 267.92 ms ± 1.65 with a 95 % CI of (266.74, 269.10), outperforming both Sizeless [17] and FnCapacitor [18].

For statistical significance, we use an independent two-sample t-test. Our results indicate that the improvements of Auto Opt Mem over Sizeless [17] are statistically significant for all metrics (p < 0.02). For FnCapacitor [18], the cost differences are significant with p = 0.035. The statistical analysis confirms that Auto Opt Mem provides consistently better performance, with several improvements being significant at the 95 % confidence level.
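The mean, standard deviation, 95 % confidence interval, and t statistic can be recomputed from the per-experiment samples in Table 9 using only the standard library. The sketch below uses Welch's (unequal-variance) form of the two-sample t statistic; small differences from the rounded values in Table 10 can arise from rounding in the reported figures.

```python
import math
from statistics import mean, stdev

# Per-experiment latency samples (ms) from Table 9
sizeless = [362.85, 370.14, 355.92, 380.18, 364.77, 359.60, 372.31, 368.54]
auto_opt = [267.91, 270.34, 265.10, 268.05, 266.83, 269.14, 267.54, 266.41]

def ci95(xs, t_crit=2.365):
    """95 % confidence interval for the mean (t critical value for df = 7)."""
    m = mean(xs)
    half = t_crit * stdev(xs) / math.sqrt(len(xs))
    return m - half, m + half

def welch_t(a, b):
    """Welch's two-sample t statistic (unequal variances)."""
    va = stdev(a) ** 2 / len(a)
    vb = stdev(b) ** 2 / len(b)
    return (mean(a) - mean(b)) / math.sqrt(va + vb)

print(mean(auto_opt), stdev(auto_opt))
print(ci95(auto_opt))
print(welch_t(sizeless, auto_opt))
```

Converting the t statistic to a p-value requires the t distribution's CDF (e.g., `scipy.stats.ttest_ind` with `equal_var=False`); the very large latency gap between the two methods is what produces the small p-values reported above.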
The average latency, average cost, and average quality of service of the different methods are shown in Fig. 8.

Table 7
Reward function values in different experiments.

Experiment | Latency (ms) | Cost ($) | QoS (%) | Utilization (%) | Reward
1 | 90 | 0.17 | 66 | 75 | 0.11
2 | 88 | 0.16 | 63 | 78 | 0.43
3 | 92 | 0.18 | 67 | 74 | 0.8
4 | 87 | 0.15 | 70 | 80 | 0.65
5 | 89 | 0.17 | 64 | 76 | 0.21
6 | 91 | 0.16 | 68 | 73 | 0.03
7 | 86 | 0.15 | 71 | 79 | 0.65
8 | 85 | 0.14 | 72 | 82 | 0.91

In Fig. 8a, Auto Opt Mem (green line) has the lowest latency in all tests, indicating better optimization of memory allocation and faster execution of serverless functions.

In Fig. 8b, Auto Opt Mem minimizes the cost of executing functions in all experiments. The method of Jindal et al. [18] reduces the cost compared to Eismann et al. [17] but is still costlier than Auto Opt Mem.

In Fig. 8c, Auto Opt Mem (green line) has the highest quality of service (QoS), indicating increased reliability and optimal performance under different workload conditions. The method of Jindal et al. [18] provides better QoS than that of Eismann et al. [17] but still falls short of Auto Opt
Fig. 5. Reward function values for eight experimental runs.

Mem's. Table 11 further illustrates the differences of our approach from previous methods.

Compared to static allocation strategies, Auto Opt Mem avoids a great deal of resource wastage by allocating memory according to actual function requirements rather than over-provisioning. Such dynamic allocation leads to cost savings to a large extent, especially in the case of varying workloads or mixed kinds of functions. Moreover, the ability of Auto Opt Mem to minimize latency leads to improved application performance and user experience. Through efficient memory management and reduced cold start times, Auto Opt Mem ensures that functions execute both quickly and reliably, even in high-demand situations. In contrast to heuristic methods, which typically rely on oversimplified assumptions and struggle with dynamic states, Auto Opt Mem employs deep reinforcement learning to derive decisions from real current states. Such responsiveness is important in numerous serverless computing applications. While machine learning-based methods are somewhat flexible, Auto Opt Mem is better at balancing conflicting objectives, e.g., cost minimization and optimal performance. By means of a reward function that considers several parameters, Auto Opt Mem provides an overall optimization of the system.

6. Discussion

In this section, we discuss the AWS Lambda platform and potential application scenarios, and then provide a comparative analysis of memory configuration in serverless computing.

6.1. Use of AWS

The experiments were conducted in a simulated environment; the workload behavior, latency patterns, and cost models were derived from documented AWS Lambda configuration rules. This ensures that the characteristics of a real AWS serverless environment are captured while still allowing full reproducibility.

AWS Lambda serves as the conceptual reference platform due to its very high market share and representative resource allocation model with regard to, for example, memory-to-vCPU coupling and pricing scheme, which closely aligns with Azure Functions and Google Cloud Functions. Auto Opt Mem is provider-agnostic, since platform-specific constraints, such as memory ranges and CPU scaling rules, are embedded in the state representation. Thus, even though execution was simulated, the framework is transferable to real AWS Lambda deployments and to other cloud providers as well.

6.2. Potential application scenarios

Due to resource and space limitations, we used controlled micro-benchmarks. Still, Auto Opt Mem is not restricted to this setup and can also be used in real applications. For example, it helps with machine learning inference tasks, such as running image or text classification models, where reducing cost and latency is important. It is also useful for video transcoding, since converting formats with tools such as ffmpeg usually takes a lot of CPU and memory. Another case is data preprocessing, where large datasets need to be read, compressed, or filtered. Auto Opt Mem also works well in API aggregation, when several external services are called at the same time and the process is mostly I/O-bound.

These examples illustrate that Auto Opt Mem is workload-agnostic and can be applied to real-world scenarios. A full-scale practical evaluation is considered for future research. Table 12 shows real-world scenarios where Auto Opt Mem can be used.

6.3. Comparative analysis

Table 13 summarizes recent research in intelligent cloud computing that focuses on automation, optimization, and deep learning. This table shows how the proposed Auto Opt Mem framework aligns with these studies and focuses on dynamic memory and performance optimization in serverless environments.

7. Conclusions

Memory configuration in serverless computing can be challenging
Fig. 6. Comparison of reward function and other metrics.

Table 8
Comparison with previous studies.

Approach | Latency Reduction (%) | Cost Reduction (%) | QoS Improvement (%)
Auto Opt Mem vs Sizeless [17] | 25–30 | 15–18 | 10–12
Auto Opt Mem vs FnCapacitor [18] | 5–7 | 6–8 | 2–3

due to the ephemeral nature of serverless functions, which are short-lived and stateless. This research examines memory configuration mechanisms and classifies them in serverless computing into three main approaches: machine learning-based, exploration-based, and framework-based. The advantages and disadvantages of each mechanism, as well as the challenges and performance metrics affecting their effectiveness, are discussed. Memory configuration is one of the important challenges in serverless computing; in this paper, we propose an autonomous deep learning-based serverless computing memory optimization system, referred to as Auto Opt Mem. The results show that Auto Opt Mem optimizes resource utilization, decreases operation costs and latency, and enhances quality of service (QoS), and hence it can be a good fit for developers of serverless systems. Auto Opt Mem shows noticeable improvements over previous methods. Compared to Sizeless, it reduces latency by 25–30 %, lowers cost by 15–18 %, and improves QoS by 10–12 %. Against FnCapacitor, it achieves a 5–7 % latency reduction, a 6–8 % cost reduction, and a 2–3 % QoS improvement. Our experiments demonstrate that Auto Opt Mem on average provides 16.8 % lower latency, 11.8 % cost reduction, and 6.8 % QoS improvement across both methods.

In our approach, one of the important hyperparameters in deep reinforcement learning is the learning rate, which controls the extent of updates to the model's weights. A high learning rate typically causes the model to update its weights more aggressively and may cause inefficient memory usage because of unstable learning, leading to suboptimal allocation. Conversely, a very small learning rate leads to very slow training and slow convergence, delaying the optimization of memory. A high learning rate can achieve convergence quickly, but it may pass the
Fig. 7. Comparison with previous studies.

Table 9
Comparison of approaches across 8 experiments.

Approach | Latency (ms) | Cost ($) | QoS (%)
Sizeless [17] | 362.85, 370.14, 355.92, 380.18, 364.77, 359.60, 372.31, 368.54 | 0.00243, 0.00236, 0.00248, 0.00244, 0.00241, 0.00237, 0.00246, 0.00242 | 80.15, 81.24, 79.87, 80.45, 81.06, 80.83, 79.92, 80.61
FnCapacitor [18] | 278.42, 271.66, 282.10, 276.74, 274.89, 280.13, 273.80, 275.55 | 0.00256, 0.00252, 0.00257, 0.00255, 0.00251, 0.00253, 0.00259, 0.00254 | 89.44, 90.22, 88.97, 89.75, 90.13, 89.61, 89.88, 90.06
Auto Opt Mem | 267.91, 270.34, 265.10, 268.05, 266.83, 269.14, 267.54, 266.41 | 0.00221, 0.00219, 0.00223, 0.00220, 0.00218, 0.00222, 0.00221, 0.00220 | 91.08, 91.46, 90.83, 91.21, 91.37, 90.97, 91.32, 91.15

Table 10
Statistical comparison of Auto Opt Mem with baseline methods.

Approach | Metric | Average | Std. Dev. | 95 % CI | p-value vs. Auto Opt Mem
Sizeless [17] | Latency (ms) | 366.29 | 8.18 | (360.43, 372.15) | 0.002
Sizeless [17] | Cost ($) | 0.00243 | 0.00004 | (0.00240, 0.00246) | 0.018
Sizeless [17] | QoS (%) | 80.52 | 0.43 | (80.22, 80.82) | 0.001
FnCapacitor [18] | Latency (ms) | 276.29 | 3.29 | (273.97, 278.61) | 0.120
FnCapacitor [18] | Cost ($) | 0.00255 | 0.00003 | (0.00253, 0.00257) | 0.035
FnCapacitor [18] | QoS (%) | 89.88 | 0.41 | (89.59, 90.17) | 0.280
Auto Opt Mem | Latency (ms) | 267.92 | 1.65 | (266.74, 269.10) | –
Auto Opt Mem | Cost ($) | 0.00221 | 0.00002 | (0.00220, 0.00222) | –
Auto Opt Mem | QoS (%) | 91.17 | 0.24 | (91.00, 91.34) | –

optimal point. As a result, it will lead to irregular updates and require additional iterations to correct errors, which will increase memory consumption. A low learning rate, in turn, helps achieve more stable and gradual convergence but may require more episodes to converge, which can increase the memory load due to the long-term storage of intermediate states. Future research directions include extending Auto Opt Mem to multi-cloud environments, and extending and stress-testing Auto Opt Mem's
Fig. 8. Comparison of performance metrics.

real-time adaptability under more diverse and large-scale workloads. Additionally, the scalability of Auto Opt Mem can be investigated on serverless frameworks other than AWS Lambda.

Funding

No funding was received for this work.

Intellectual property

We confirm that we have given due consideration to the protection of intellectual property associated with this work and that there are no impediments to publication, including the timing of publication, with respect to intellectual property. In so doing we confirm that we have followed the regulations of our institutions concerning intellectual property.

Authorship

The International Committee of Medical Journal Editors (ICMJE) recommends that authorship be based on the following four criteria:

1. Substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work; AND
2. Drafting the work or revising it critically for important intellectual content; AND
3. Final approval of the version to be published; AND
4. Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

All those designated as authors should meet all four criteria for authorship, and all who meet the four criteria should be identified as authors. For more information on authorship, please see https://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html#two.

All listed authors meet the ICMJE criteria. We attest that all authors contributed significantly to the creation of this manuscript, each having fulfilled the criteria established by the ICMJE. We confirm that the manuscript has been read and approved by all named authors, and that the order of authors listed in the manuscript has been approved by all named authors.

Contact with the editorial office

The Corresponding Author declared on the title page of the manuscript is Mostafa Ghobaei-Arani. This author submitted the manuscript using his account in EVISE and is the sole contact for the editorial process, responsible for communicating with the other authors about progress, submissions of revisions, and final approval of proofs.
|
||
|
||
|
submitted this manuscript from his/her account in EVISE: [Insert name below]

We understand that this author is the sole contact for the Editorial process (including EVISE and direct communications with the office). He/she is responsible for communicating with the other authors, including the Corresponding Author, about progress, submissions of revisions and final approval of proofs.

CRediT authorship contribution statement

Zahra Shojaee Rad: Resources, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation, Conceptualization. Mostafa Ghobaei-Arani: Writing – review & editing, Writing – original draft, Visualization, Validation, Supervision, Software, Resources, Project administration. Reza Ahsan: Writing – review & editing, Writing – original draft.

Declaration of competing interest

Potential conflict of interest exists: We wish to draw the attention of the Editor to the following facts, which may be considered as potential conflicts of interest, and to significant financial contributions to this work. The nature of potential conflict of interest is described below:

No conflict of interest exists. The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

Table 11
Differentiation of our approach from previous studies.

Aspect | Sizeless [17] | FnCapacitor [18] | Our Work (Auto Opt Mem)
Methodology | Multi-target regression using monitoring data from a single memory size | Sandboxing, performance tests, and statistical/DNN modeling | Deep Reinforcement Learning with MAPE control loop
Decision Type | Predicts execution time and cost for other memory sizes | Estimates function capacity (max concurrency under SLO) | Learns and selects optimal memory configuration
Adaptivity | Static once trained, no continuous learning | Requires offline profiling for changes | Dynamic and continuous adaptation at runtime
Focus | Memory-performance trade-offs | Function capacity and concurrency | Memory optimization balancing latency, cost, QoS, and utilization
Limitation | No runtime adaptability | Re-profiling needed for new workloads | —
Innovation | Efficient prediction with limited input | Accurate FC estimation for functions | Self-adaptive and autonomous optimization

Table 12
Application scenarios for Auto Opt Mem.

Scenario | Type of Workload | Role of Auto Opt Mem
ML inference | CPU-bound | Optimizes latency and cost by adjusting memory/CPU
Video transcoding | CPU & memory-intensive | Balances higher memory cost with faster execution time
Data preprocessing (ETL) | Mixed (CPU + I/O) | Adapts memory allocation based on input size
API aggregation | I/O-bound | Keeps memory low while ensuring QoS in parallel API calls

Table 13
Comparative analysis with recent studies in the field of intelligent cloud computing.

Ref | Focus Area | Technique | Key Contribution | Relation to Present Work
[59] | Secure data deduplication | Convergent encryption | Reduces redundancy and ensures secure cloud storage | Our DRL approach similarly targets resource efficiency, but in serverless memory optimization
[60] | Cloud key management | Machine learning-based security framework | Intelligent key lifecycle management | Both emphasize intelligent automation for secure and efficient cloud operations
[61] | Deep learning for cloud/edge/fog/IoT | Deep learning models survey | Provides insight into intelligent distributed learning paradigms | Inspires our DL-driven resource optimization framework
[62] | Cloud security and privacy | Deep learning-based attack detection | Enhances privacy and adaptive threat response | Our work extends this intelligence toward performance optimization and QoS improvement in serverless systems

References

[1] Ioana Baldini, Paul Castro, Kerry Chang, Perry Cheng, Stephen Fink, Vatche Ishakian, Nick Mitchell, et al., Serverless computing: current trends and open problems, Res. Adv. Cloud Comput. (2017) 1–20.
[2] A. Ebrahimi, M. Ghobaei-Arani, H. Saboohi, Cold start latency mitigation mechanisms in serverless computing: taxonomy, review, and future directions, J. Syst. Archit. 151 (2024) 103115.
[3] AWS, “Serverlessvideo: connect with users around the world!.” https://serverlessland.com/, 2023.
[4] AWS, “Serverless case study - netflix.” https://dashbird.io/blog/serverless-case-study-netflix/, 2020.
[5] CapitalOne, “Capital one saves developer time and reduces costs by going serverless on aws.” https://aws.amazon.com/solutions/case-studies/capital-one-lambda-ecs-case-study/, 2023.
[6] E. Johnson, “Deploying ml models with serverless templates.” https://aws.amazon.com/blogs/compute/deploying-machine-learning-models-with-serverless-templates/, 2021.
[7] A. Sojasingarayar, “Build and deploy llm application in aws.” https://medium.com/@abonia/build-and-deploy-llm-application-in-aws-cca46c662749, 2024.
[8] A. Gholami, M. Ghobaei-Arani, A trust model based on quality of service in cloud computing environment, Int. J. Database Theor. Appl. 8 (5) (2015) 161–170, https://doi.org/10.14257/ijdta.2015.8.5.13.
[9] DataDog, The State of Serverless, https://www.datadoghq.com/state-of-serverless/, 2020.
[10] M. Tari, M. Ghobaei-Arani, J. Pouramini, M. Ghorbian, Auto-scaling mechanisms in serverless computing: a comprehensive review, Comput. Sci. Rev. 53 (2024) 100650, https://doi.org/10.1016/j.cosrev.2024.100650.
[11] M. Ghorbian, M. Ghobaei-Arani, R. Asadolahpour-Karimi, Function placement approaches in serverless computing: a survey, J. Syst. Archit. 157 (2024) 103291.
[12] B. Jacob, R. Lanyon-Hogg, D.K. Nadgir, A.F. Yassin, A Practical Guide to the IBM Autonomic Computing Toolkit, IBM, International Technical Support Organization, 2004.
[13] Michael Maurer, Ivan Breskovic, Vincent C. Emeakaroha, Ivona Brandic, Revealing the MAPE loop for the autonomic management of cloud infrastructures, in: 2011 IEEE Symposium on Computers and Communications (ISCC), IEEE, 2011, pp. 147–152.
[14] Stuart J. Russell, Peter Norvig, Artificial Intelligence: A Modern Approach, Pearson, 2016.
[15] Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore, Reinforcement learning: a survey, J. Artif. Intell. Res. 4 (1996) 237–285.
[16] Rajkumar Rajavel, Mala Thangarathanam, Adaptive probabilistic behavioural learning system for the effective behavioural decision in cloud trading negotiation market, Futur. Gener. Comput. Syst. 58 (2016) 29–41.
[17] Simon Eismann, Long Bui, Johannes Grohmann, Cristina Abad, Nikolas Herbst, Samuel Kounev, Sizeless: predicting the optimal size of serverless functions, in: Proceedings of the 22nd International Middleware Conference, 2021, pp. 248–259.
[18] Anshul Jindal, Mohak Chadha, Shajulin Benedict, Michael Gerndt, Estimating the capacities of function-as-a-service functions, in: Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing Companion, 2021, pp. 1–8.
[19] Djob Mvondo, Mathieu Bacou, Kevin Nguetchouang, Lucien Ngale, Stéphane Pouget, Josiane Kouam, Renaud Lachaize, et al., OFC: an opportunistic caching system for FaaS platforms, in: Proceedings of the Sixteenth European Conference on Computer Systems, 2021, pp. 228–244.
[20] Myung-Hyun Kim, Jaehak Lee, Heonchang Yu, Eunyoung Lee, Improving memory utilization by sharing DNN models for serverless inference, in: 2023 IEEE International Conference on Consumer Electronics (ICCE), IEEE, 2023, pp. 1–6.
[21] Siddharth Agarwal, Maria A. Rodriguez, Rajkumar Buyya, Input-based ensemble-learning method for dynamic memory configuration of serverless computing functions, arXiv preprint arXiv:2411.07444 (2024).
[22] Gor Safaryan, Anshul Jindal, Mohak Chadha, Michael Gerndt, SLAM: SLO-aware memory optimization for serverless applications, in: 2022 IEEE 15th International Conference on Cloud Computing (CLOUD), IEEE, 2022, pp. 30–39.
[23] Robert Cordingly, Sonia Xu, Wes Lloyd, Function memory optimization for heterogeneous serverless platforms with CPU time accounting, in: 2022 IEEE International Conference on Cloud Engineering (IC2E), IEEE, 2022, pp. 104–115.
[24] Tetiana Zubko, Anshul Jindal, Mohak Chadha, Michael Gerndt, MAFF: self-adaptive memory optimization for serverless functions, in: European Conference on Service-Oriented and Cloud Computing, Springer International Publishing, Cham, 2022, pp. 137–154.
[25] Josef Spillner, Resource management for cloud functions with memory tracing, profiling and autotuning, in: Proceedings of the 2020 Sixth International Workshop on Serverless Computing, 2020, pp. 13–18.
[26] Zengpeng Li, Huiqun Yu, Guisheng Fan, Time-cost efficient memory configuration for serverless workflow applications, Concurr. Comput.: Pract. Exp. 34 (27) (2022) e7308.
[27] Andrea Sabbioni, Lorenzo Rosa, Armir Bujari, Luca Foschini, Antonio Corradi, A shared memory approach for function chaining in serverless platforms, in: 2021 IEEE Symposium on Computers and Communications (ISCC), IEEE, 2021, pp. 1–6.
[28] Aakanksha Saha, Sonika Jindal, EMARS: efficient management and allocation of resources in serverless, in: 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), IEEE, 2018, pp. 827–830.
[29] Amit Samanta, Faraz Ahmed, Lianjie Cao, Ryan Stutsman, Puneet Sharma, Persistent memory-aware scheduling for serverless workloads, in: 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), IEEE, 2023, pp. 615–621.
[30] Meenakshi Sethunath, Yang Peng, A joint function warm-up and request routing scheme for performing confident serverless computing, High-Confidence Comput. 2 (3) (2022) 100071.
[31] Anisha Kumari, Manoj Kumar Patra, Bibhudatta Sahoo, Ranjan Kumar Behera, Resource optimization in performance modeling for serverless application, Int. J. Inf. Technol. 14 (6) (2022) 2867–2875.
[32] Anjo Vahldiek-Oberwagner, Mona Vij, Meshwa: the case for a memory-safe software and hardware architecture for serverless computing, arXiv preprint arXiv:2211.08056 (2022).
[33] Divyanshu Saxena, Tao Ji, Arjun Singhvi, Junaid Khalid, Aditya Akella, Memory deduplication for serverless computing with Medes, in: Proceedings of the Seventeenth European Conference on Computer Systems, 2022, pp. 714–729.
[34] Jie Li, Laiping Zhao, Yanan Yang, Kunlin Zhan, Keqiu Li, Tetris: memory-efficient serverless inference through tensor sharing, in: 2022 USENIX Annual Technical Conference (USENIX ATC 22), 2022.
[35] Dmitrii Ustiugov, Plamen Petrov, Marios Kogias, Edouard Bugnion, Boris Grot, Benchmarking, analysis, and optimization of serverless function snapshots, in: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021, pp. 559–572.
[36] Ao Wang, Jingyuan Zhang, Xiaolong Ma, Ali Anwar, Lukas Rupprecht, Dimitrios Skourtis, Vasily Tarasov, Feng Yan, Yue Cheng, InfiniCache: exploiting ephemeral serverless functions to build a cost-effective memory cache, in: 18th USENIX Conference on File and Storage Technologies (FAST 20), 2020, pp. 267–281.
[37] Anurag Khandelwal, Yupeng Tang, Rachit Agarwal, Aditya Akella, Ion Stoica, Jiffy: elastic far-memory for stateful serverless analytics, in: Proceedings of the Seventeenth European Conference on Computer Systems, 2022, pp. 697–713.
[38] Orestis Lagkas Nikolos, Chloe Alverti, Stratos Psomadakis, Georgios Goumas, Nectarios Koziris, Fast and efficient memory reclamation for serverless MicroVMs, arXiv preprint arXiv:2411.12893 (2024).
[39] Zahra Shojaee Rad, Mostafa Ghobaei-Arani, Data pipeline approaches in serverless computing: a taxonomy, review, and research trends, J. Big Data 11 (1) (2024) 1–42.
[40] Zahra Shojaee Rad, Mostafa Ghobaei-Arani, Reza Ahsan, Memory orchestration mechanisms in serverless computing: a taxonomy, review and future directions, Cluster Comput. (2024) 1–27.
[41] R. Wolski, C. Krintz, F. Bakir, G. George, W.-T. Lin, CSPOT: portable, multi-scale functions-as-a-service for IoT, in: Proceedings of the 4th ACM/IEEE Symposium on Edge Computing (SEC ’19), Association for Computing Machinery, New York, 2019, pp. 236–249, https://doi.org/10.1145/3318216.3363314.
[42] V. Yussupov, U. Breitenbücher, F. Leymann, M. Wurster, A systematic mapping study on engineering function-as-a-service platforms and tools, in: Proceedings of the 12th IEEE/ACM International Conference on Utility and Cloud Computing (UCC ’19), Association for Computing Machinery, New York, 2019, pp. 229–240, https://doi.org/10.1145/3344341.3368803.
[43] Zahra Shojaee Rad, Mostafa Ghobaei-Arani, Federated serverless cloud approaches: a comprehensive review, Comput. Electric. Eng. 124 (2025) 110372.
[44] Ioana Baldini, Paul Castro, Kerry Chang, Perry Cheng, Stephen Fink, Vatche Ishakian, Nick Mitchell, et al., Serverless computing: current trends and open problems, Res. Adv. Cloud Comput. (2017) 1–20.
[45] M. Elsakhawy, M. Bauer, FaaS2F: a framework for defining execution-SLA in serverless computing, in: 2020 IEEE Cloud Summit, 2020, pp. 58–65, https://doi.org/10.1109/IEEECloudSummit48914.2020.00015.
[46] A.U. Gias, G. Casale, COCOA: cold start aware capacity planning for function-as-a-service platforms, in: 2020 28th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2020, pp. 1–8, https://doi.org/10.1109/MASCOTS50786.2020.9285966.
[47] C.K. Dehury, S.N. Srirama, T.R. Chhetri, CCoDaMiC: a framework for coherent coordination of data migration and computation platforms, Futur. Gener. Comput. Syst. 109 (2020) 1–16, https://doi.org/10.1016/j.future.2020.03.029.
[48] A. Tariq, A. Pahl, S. Nimmagadda, E. Rozner, S. Lanka, Sequoia: enabling quality-of-service in serverless computing, in: Proceedings of the 11th ACM Symposium on Cloud Computing (SoCC ’20), Association for Computing Machinery, New York, 2020, pp. 311–327, https://doi.org/10.1145/3419111.3421306.
[49] J. Manner, S. Kolb, G. Wirtz, Troubleshooting serverless functions: a combined monitoring and debugging approach, SICS Softw.-Intensiv. Cyber-Phys. Syst. 34 (2) (2019) 99–104, https://doi.org/10.1007/s00450-019-00398-6.
[50] J. Nupponen, D. Taibi, Serverless: what it is, what to do and what not to do, in: 2020 IEEE International Conference on Software Architecture Companion (ICSA-C), 2020, pp. 49–50, https://doi.org/10.1109/ICSA-C50368.2020.00016.
[51] G. Cordasco, M. D’Auria, A. Negro, V. Scarano, C. Spagnuolo, FLY: a domain-specific language for scientific computing on FaaS, in: U. Schwardmann, C. Boehme, D.B. Heras, V. Cardellini, E. Jeannot, A. Salis, C. Schifanella, R.R. Manumachu, D. Schwamborn, L. Ricci, O. Sangyoon, T. Gruber, L. Antonelli, S.L. Scott (Eds.), Euro-Par 2019: Parallel Processing Workshops, Springer, Cham, 2020, pp. 531–544.
[52] B. Jambunathan, K. Yoganathan, Architecture decision on using microservices or serverless functions with containers, in: 2018 International Conference on Current Trends Towards Converging Technologies (ICCTCT), 2018, pp. 1–7, https://doi.org/10.1109/ICCTCT.2018.8551035.
[53] A. Keshavarzian, S. Sharifian, S. Seyedin, Modified deep residual network architecture deployed on serverless framework of IoT platform based on human activity recognition application, Futur. Gener. Comput. Syst. 101 (2019) 14–28, https://doi.org/10.1016/j.future.2019.06.009.
[54] Gerald Tesauro, Temporal difference learning and TD-Gammon, Commun. ACM 38 (3) (1995) 58–68.
[55] Eric Rutten, Nicolas Marchand, Daniel Simon, Feedback control as MAPE-K loop in autonomic computing, in: Software Engineering for Self-Adaptive Systems III. Assurances: International Seminar, Dagstuhl Castle, Germany, December 15–19, 2013, Revised Selected and Invited Papers, Springer International Publishing, Cham, 2018, pp. 349–373.
[56] Evangelina Lara, Leocundo Aguilar, Mauricio A. Sanchez, Jesús A. García, Adaptive security based on MAPE-K: a survey, in: Applied Decision-Making: Applications in Computer Sciences and Engineering, Springer International Publishing, Cham, 2019, pp. 157–183.
[57] Jeffrey O. Kephart, David M. Chess, The vision of autonomic computing, Computer 36 (1) (2003) 41–50.
[58] Alistair McLean, Roy Sterritt, Autonomic computing in the cloud: an overview of past, present and future trends, in: The 2023 IARIA Annual Congress on Frontiers in Science, Technology, Services, and Applications: Technical Advances and Human Consequences, 2023.
[59] Shahnawaz Ahmad, Mohd Arif, Javed Ahmad, Mohd Nazim, Shabana Mehfuz, Convergent encryption enabled secure data deduplication algorithm for cloud environment, Concurr. Comput.: Pract. Exp. 36 (21) (2024) e8205.
[60] Shahnawaz Ahmad, Shabana Mehfuz, Shabana Urooj, Najah Alsubaie, Machine learning-based intelligent security framework for secure cloud key management, Cluster Comput. 27 (5) (2024) 5953–5979.
[61] Shahnawaz Ahmad, Iman Shakeel, Shabana Mehfuz, Javed Ahmad, Deep learning models for cloud, edge, fog, and IoT computing paradigms: survey, recent advances, and future directions, Comput. Sci. Rev. 49 (2023) 100568.
[62] Shahnawaz Ahmad, Mohd Arif, Shabana Mehfuz, Javed Ahmad, Mohd Nazim, Deep learning-based cloud security: innovative attack detection and privacy focused key management, IEEE Trans. Comput. (2025).