Computer Standards & Interfaces 97 (2026) 104098
Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi
An autonomous deep reinforcement learning-based approach for memory
configuration in serverless computing
Zahra Shojaee Rad, Mostafa Ghobaei-Arani *, Reza Ahsan
Department of Computer Engineering, Qo.C., Islamic Azad University, Qom, Iran
ARTICLE INFO

Keywords: Serverless computing; Memory configuration; Deep reinforcement learning; Autonomous computing; Function-as-a-service

ABSTRACT

Serverless computing has become very popular in recent years due to its cost savings and flexibility. It is a cloud computing model that allows developers to create and deploy code without having to manage the infrastructure, and it has been embraced for its scalability, cost savings, and ease of use. However, memory configuration is one of the important challenges in serverless computing due to the transient nature of serverless functions, which are stateless and ephemeral. In this paper, we propose an autonomous approach for memory configuration, called Auto Opt Mem, that uses deep reinforcement learning and a reward mechanism. In the Auto Opt Mem mechanism, the system learns to allocate memory resources to serverless functions in a way that balances overall performance and minimizes wastage of resources. Finally, we validate the effectiveness of our solution; the findings reveal that the Auto Opt Mem mechanism enhances resource utilization, reduces operating cost and latency, and improves quality of service (QoS). Our experiments demonstrate that the Auto Opt Mem mechanism achieves 16.8 % lower latency and an 11.8 % cost reduction compared to static allocation, and a 6.8 % improvement in QoS, resource utilization, and memory-allocation efficiency compared with baseline methods.
* Corresponding author.
E-mail address: mo.ghobaei@iau.ac.ir (M. Ghobaei-Arani).
https://doi.org/10.1016/j.csi.2025.104098
Received 2 May 2025; Received in revised form 16 November 2025; Accepted 17 November 2025
Available online 19 November 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

1. Introduction

Serverless computing has emerged as an extended cloud computing model that offers many advantages in flexibility, scalability, and cost efficiency [1]. By separating out the management of the underlying infrastructure, developers can focus on writing and deploying code without worrying about server provisioning or maintenance. There has been a lot of progress in various areas related to serverless computing [2]. One aspect is Function as a Service (FaaS), which has increasingly been associated with a variety of applications, including video streaming platforms [3], multimedia processing [4], Continuous Integration/Continuous Deployment (CI/CD) pipelines [5], Artificial Intelligence/Machine Learning (AI/ML) inference tasks [6], and query processing for Large Language Models (LLMs) [7]. FaaS is a serverless cloud computing model that allows developers to run small, manageable services in isolated environments called function instances [8].

Despite these advantages, memory configuration in serverless environments is a complex challenge due to the transient and stateless nature of serverless functions. Choosing the right amount of memory, or resource size, is important and challenging because it determines execution times and costs. A recent survey found that 47 % of serverless functions in production use the default memory size, indicating that developers often overlook the importance of resource sizing [9]. Traditional memory configuration methods are often manual settings or static allocation, which may lead to inefficiencies such as overprovisioning or underutilization. These inefficiencies can increase costs or decrease performance and affect the effectiveness of the serverless function. By employing deep learning models, memory configuration can be automated, leading to an efficient solution. Deep learning models analyze historical data to dynamically predict optimal memory settings, adapt to varying workloads, and minimize latency and cost. This approach uses the ability of deep learning to identify complex patterns and relationships in data, enabling more accurate and efficient resource management.

Recent research shows the importance of intelligent and autonomous systems in different domains. For instance, Arduino-based IoT automation systems have shown how lightweight and adaptive architectures can improve efficiency and minimize manual intervention in constrained environments [10]. Likewise, autonomous AI frameworks for fraud detection on the Dark Web demonstrate the ability of self-learning mechanisms to adapt to dynamic and unpredictable conditions [11]. These advances further motivate the need for AI-driven autonomic
solutions, such as our proposed Auto Opt Mem framework for serverless memory configuration.

1.1. Research gap and motivation

Previous approaches are often static or limited to a specific platform. They do not meet the need for real-time adaptability to changing workloads, relying solely on statistical modeling and lacking the ability to adapt in real time. The direct relationship between cost, latency, and QoS has rarely been considered in a comprehensive framework. Our work addresses these research gaps by introducing Auto Opt Mem, based on the MAPE loop and DRL.

The motivation for this study is that many functions still use default memory values, resulting in wasted resources, increased costs, and reduced quality of service (QoS). For example, 47 % of users rely on default memory settings, which emphasizes the importance of intelligent and adaptive optimization [9]. Addressing this gap is essential to improve performance and cost-effectiveness in serverless environments.

1.2. Our approach

In this paper, we propose an autonomous memory configuration with a deep learning model to predict memory settings, based on a combination of the concepts of autonomic computing and deep reinforcement learning (DRL), with the aim of increasing performance and cost-effectiveness. To realize autonomic computing, IBM has introduced a reference framework for autonomic control loops known as the MAPE (Monitor, Analyze, Plan, Execute) loop [12,13]. This MAPE control loop resembles the general agent model put forth by Russell and Norvig [14], where an intelligent agent observes its surroundings through sensors and utilizes these observations to decide on actions to take within that environment. The proposed approach follows the MAPE control loop, which consists of four phases: monitoring (M), analysis (A), planning (P), and execution (E). First, in the monitoring phase, the system observes and collects the current state of the environment (memory in serverless functions). In the analysis phase, the agent, which is a deep neural network, analyzes the observed state and updates its policy based on it and the reward received. In the planning phase, the agent schedules an action (i.e., a memory configuration) based on the learned policy. In the execution phase, the scheduled action is applied to the environment. We utilize Deep Reinforcement Learning (DRL) [15,16] as a decision-making tool that leverages the predicted outcomes from the analysis phase to determine the best memory configuration during the planning phase. Reinforcement Learning (RL) is a self-learning approach that enhances its effectiveness by continuously interacting with the cloud environment.

1.3. Main contributions

The main contributions of this research can be summarized as follows:

• We propose an autonomic method using deep reinforcement learning to predict memory configuration; this method operates based on a reward mechanism.
• We designed a multi-objective reward normalization mechanism that simultaneously balances latency, cost, utilization, and QoS.
• We integrated the MAPE-K control loop with deep reinforcement learning (DRL) to enable closed-loop online adaptation.
• Auto Opt Mem supports real-time continuous learning across varying workloads, which clearly differentiates it from static or offline ML-based predictors.
• Experiments validate the effectiveness of the proposed method and demonstrate performance improvements in metrics such as latency and cost.

1.4. Paper organization

This paper is organized into several sections: Section 2 reviews related memory configuration methods. Section 3 offers background information. Section 4 presents a comprehensive explanation of the proposed solution. Section 5 assesses and discusses the experimental results. Section 6 presents the discussion. Section 7 presents the conclusions with our findings and outlines future research directions.

2. Related works

This section discusses memory configuration approaches in serverless computing. These approaches are categorized into three main groups: machine learning-based approaches, heuristic-based approaches, and framework-based approaches.

2.1. Machine learning-based approaches

Simon Eismann et al. [17] have presented an approach called "Sizeless" for predicting the optimal resource size for serverless functions in cloud computing, based on monitoring data from a single memory size. It highlights the challenges developers face in selecting resource sizes and shows that the method can achieve an average prediction error of 15.3 %, optimizing memory allocation for 79 % of functions, resulting in a 39.7 % speedup and a 2.6 % cost reduction. Anshul Jindal et al. [18] have presented a tool called FnCapacitor for estimating the Function Capacity (FC) of Function-as-a-Service (FaaS) functions in serverless computing environments. It overcomes performance challenges due to system abstractions and dependencies between functions. Through load testing and modeling, FnCapacitor provides accurate FC predictions using statistical methods and deep learning, demonstrating effectiveness on platforms like AWS Lambda and Google Cloud Functions. Djob Mvondo et al. [19] have presented OFC, an in-memory caching system designed for Functions-as-a-Service (FaaS) platforms to improve performance by reducing latency during data access. It leverages machine learning to predict memory requirements for function invocations, utilizing otherwise wasted memory from over-provisioning and idle sandboxes. OFC demonstrates significant execution time improvements for both single-stage and pipelined functions, enhancing efficiency without requiring changes to existing application code. Myung-Hyun Kim et al. [20] have introduced ShmFaas, a serverless platform designed to improve memory utilization for deep neural network (DNN) inference by sharing models in-memory across containers. They address data duplication and cold start issues, particularly in resource-constrained edge cloud environments. Experimental results show that ShmFaas reduces memory usage by over 29.4 % compared to common systems, while maintaining negligible latency overhead and enhancing throughput. Siddharth Agarwal et al. [21] have presented MemFigLess, an input-aware memory allocation framework for serverless computing functions, designed to optimize resource usage and reduce costs. Using a multi-output Random Forest Regression model, it correlates the input features of the function with memory requirements, leading to accurate memory configuration. The evaluation shows that MemFigLess can significantly decrease resource allocation and save on runtime costs. Finally, Table 1 shows a comparison of machine learning-based approaches from related studies.

Table 1
Comparison of machine learning-based approaches.

[17]
  Metric: Execution time; Resource consumption; Performance overhead
  Method: Multi-target regression model; Monitoring data
  Advantage: Reduces execution time; Decreases cost; Requires no performance test
  Disadvantage: Limited to specific cloud providers
  Tools: AWS Lambda; Node.js

[18]
  Metric: Function Capacity (FC)
  Method: Statistical and deep learning approaches
  Advantage: Accurate FC predictions
  Disadvantage: Limited to specific FaaS platforms
  Tools: Python; FnCapacitor; Google Cloud Functions (GCF); AWS Lambda

[19]
  Metric: Function Capacity (FC)
  Method: OFC tool uses machine learning
  Advantage: Reduction in execution time; Cost-effective; Utilizes idle memory; Transparent
  Disadvantage: Overhead from cache management
  Tools: Python; Java; OFC; AWS

[20]
  Metric: Efficiency of memory usage
  Method: ShmFaas system shares DNN models in-memory
  Advantage: Reduces memory usage; Minimizes cold start delays; Minimal code changes
  Disadvantage: Complexity
  Tools: Python; ShmFaas; Kubernetes

[21]
  Metric: Memory utilization
  Method: Uses input-aware Random Forest Regression
  Advantage: Reduces memory allocation; Reduces costs
  Disadvantage: Limited to specific platforms; Overhead from monitoring
  Tools: AWS Lambda; Python; AWS CloudWatch

2.2. Heuristic-based approaches

Goor-Safarian et al. [22] have presented SLAM, a tool for optimizing memory settings for serverless applications consisting of multiple Function-as-a-Service (FaaS) functions. It addresses the issues in balancing cost and performance while meeting Service Level Objectives (SLOs). By utilizing distributed tracing, SLAM estimates execution times under various memory settings and identifies optimal configurations. Robert Cordingly et al. [23] presented a method called CPU Time Accounting Memory Selection (CPU-TAMS) to optimize memory configurations for serverless Function-as-a-Service (FaaS) platforms. CPU-TAMS uses CPU time accounting and regression modeling methods to provide recommendations that reduce execution time and costs. Tetiana Zubko et al. [24] presented MAFF (Memory Allocation Framework for FaaS functions), which is a framework to optimize memory allocation for serverless functions automatically. MAFF adapts memory settings based on function requirements and employs various algorithms to minimize costs and execution duration. The framework was tested on AWS Lambda, demonstrating improved performance compared to existing memory optimization tools. Josef Spillner [25] discussed resource management for serverless functions, focusing on memory tracking, profiling, and automatic tuning. The author outlines the issues that developers face in determining memory allocation due to coarse-grained configurations from cloud providers, and proposes tools to measure memory consumption and dynamically adjust allocations to reduce waste and costs and improve performance in Function-as-a-Service (FaaS) environments.

Zengpeng Li et al. [26] have presented algorithms for optimizing memory configuration in serverless workflow applications, specifically a heuristic urgency-based algorithm (UWC) and a meta-heuristic hybrid algorithm (BPSO). These algorithms aim to balance execution time and cost for serverless applications, to solve the challenges posed by memory allocation and performance modeling. Andrea Sabioni et al. [27] have introduced a shared-memory approach for function chaining on serverless platforms and proposed a container-based architecture that increases the efficiency of function composition on the same host. By using a message-oriented middleware that operates over shared memory, this approach reduces response latency and improves resource utilization. The results show performance improvements in request completion rates and reduced latency during function execution. Aakanksha Saha et al. [28] have presented EMARS, an efficient resource management system designed for serverless cloud computing, focusing on optimizing memory allocation for containers. Built on the OpenLambda platform, EMARS uses predictive models based on application workloads to adjust memory limits dynamically, enhancing resource utilization and reducing latency. Experiments demonstrate that tailored memory settings improve performance in serverless functions. Amit Samanta et al. [29] have discussed the issues and opportunities of integrating persistent memory (PM) into serverless computing. They show how PM's unique characteristics, such as its direct load/store access, can increase performance but also lead to bottlenecks when multiple threads concurrently write to it. They propose a PM-aware scheduling system for serverless workloads that optimizes job completion time by managing concurrent access and improving efficiency while ensuring fairness among applications. Meenakshi Sethunath et al. [30] have proposed a joint function warm-up and request routing scheme for serverless computing that optimally utilizes both edge and cloud resources. It addresses issues like high latency and cold-start delays by maximizing the hit ratio of requests, and it reduces latency while considering memory and budget constraints. Anisha Kumari et al. [31] have proposed a performance model for optimizing resource allocation in serverless applications, addressing issues like cost estimation and performance evaluation. It introduces a greedy optimization algorithm to improve end-to-end response time while considering budget constraints. They utilize serverless applications on AWS to analyze the trade-offs between performance and cost, demonstrating the model's effectiveness in finding optimal resource configurations. Finally, Table 2 shows a comparison of heuristic-based approaches from related studies.

Table 2
Comparison of heuristic-based approaches.

[22]
  Metric: Memory configuration effectiveness based on Service Level Objectives (SLOs)
  Method: Distributed tracing; Max-heap-based optimization algorithm
  Advantage: Balances cost and performance
  Disadvantage: Limited to specific platforms
  Tools: Python; AWS; SLAM

[23]
  Metric: Efficiency
  Method: CPU time accounting; Regression modeling
  Advantage: Reduces runtime; Reduces cost
  Disadvantage: Limited to specific FaaS platforms
  Tools: Python; AWS; GCF

[24]
  Metric: Effectiveness for FaaS functions
  Method: Linear, Binary, and Gradient Descent algorithms for self-adaptive memory optimization
  Advantage: Lower cost; Faster execution
  Disadvantage: Requires specific function profiling
  Tools: Python; AWS

[25]
  Metric: Efficiency; Cost optimization; Autoscaling resources
  Method: Utilizes memory tracing, profiling, and autotuning tools
  Advantage: Reduces cost; Improves performance
  Disadvantage: Requires extensive profiling data
  Tools: AWS; Functracer; Autotuner; costcalculator

[26]
  Metric: Efficiency
  Method: Heuristic (UWC) and meta-heuristic (BPSO) algorithms
  Advantage: Time-cost tradeoff; Optimal workflow
  Disadvantage: Complexity; Computational overhead
  Tools: AWS; Python; UWC; BPSO

[27]
  Metric: Response latency; Resource usage
  Method: Shared memory, using a message-oriented middleware
  Advantage: Improves request completion rates; Improves response time; Optimized resource usage
  Disadvantage: Limited to co-located functions; Complexity in managing shared memory
  Tools: Message-oriented middleware

[28]
  Metric: Memory allocation efficiency; Latency
  Method: Workload-based model; Memory-based model
  Advantage: Optimizes resource usage; Reduces latency
  Disadvantage: Complexity
  Tools: OpenLambda; Python

[29]
  Metric: Throughput; Job completion time (JCT)
  Method: Performance modeling and admission control for concurrent access
  Advantage: Improves throughput; Reduces latency
  Disadvantage: Complexity
  Tools: OpenFaaS; Intel Optane DCPMM

[30]
  Metric: Latency reduction; Request hit ratio
  Method: Joint function warm-up; Routing for edge and cloud collaboration
  Advantage: Reduces cold-start latency; Optimizes performance
  Disadvantage: Complexity; Dependent on accurate profiling
  Tools: AWS; Azure Functions

[31]
  Metric: End-to-end response time; Cost
  Method: Greedy optimization algorithm
  Advantage: Reduces latency; Reduces cost; Handles cold start delay
  Disadvantage: Complexity; Dependent on accurate profiling
  Tools: Amazon Web Services (AWS)

2.3. Framework-based approaches

Anjo Vahldiek-Oberwagner et al. [32] have proposed a Memory-Safe Software and Hardware Architecture (MeSHwA) to enhance serverless computing and microservices by leveraging memory-safe languages like Rust and WebAssembly. It aims to reduce infrastructure overheads associated with cloud architectures while improving performance and security through a unified runtime environment that isolates services effectively. Divyanshu Saxena et al. [33] have presented Medes, a serverless computing framework that improves performance and resource efficiency by introducing a deduplicated sandbox state. This state reduces memory usage by removing redundant memory chunks across sandboxes, allowing for faster function startups and improved management of warm and cold states. Experiments show that Medes under memory pressure can reduce end-to-end latency and cold starts. Ji Li et al. [34] designed TETRIS, a memory-efficient serverless platform for deep learning inference. It addresses memory overconsumption in serverless systems by implementing tensor sharing and runtime optimization, reducing memory usage while increasing function density. TETRIS automates memory sharing and instance scheduling,
providing efficient resource utilization without compromising performance.

Dmitrii Ustiugov et al. [35] have discussed cold-start latency in serverless computing and introduced vHive, an open-source framework for experimentation. It shows the inefficiencies of snapshot-based function invocations, which can result in high execution times due to frequent page faults. They propose REAP, which prefetches memory pages to reduce cold-start delays by 3.7 times, improving the performance of serverless functions. Ao Wang et al. [36] have presented INFINICACHE, an innovative in-memory object caching system leveraging ephemeral serverless functions to provide a cost-effective solution for large-object caching in web applications. It shows the system's ability to achieve cost savings while maintaining high data availability and performance through techniques like erasure coding and intelligent resource management. Anurag Khandelwal et al. [37] have presented Jiffy, an elastic far-memory system for serverless analytics. It overcomes the limitations of existing memory allocation systems by enabling fine-grained, block-level resource allocation, allowing jobs to meet their real-time memory needs. Jiffy minimizes performance degradation and resource underutilization by dynamically managing memory for individual tasks. Orestis Lagkas Nikolos et al. [38] have introduced HotMem, a mechanism designed to enhance memory reclamation in serverless computing environments for Function-as-a-Service (FaaS) models using microVMs. It addresses the issues of memory elasticity during scaling down by segregating memory allocations for individual function instances, thereby enabling rapid and efficient reclamation without the overhead of page migrations. HotMem improves memory management performance, maintaining low latency for function execution. Finally, Table 3 shows a comparison of framework-based approaches from related studies.

Table 3
Comparison of framework-based approaches.

[32]
  Metric: Performance and resource efficiency; Resource sharing
  Method: MeSHwA, a memory-safe software and hardware architecture
  Advantage: Increases security and performance
  Disadvantage: Complexity
  Tools: Python; Rust; Wasm

[33]
  Metric: End-to-end latency; Memory usage
  Method: Medes, a framework utilizing memory deduplication to create a new deduplicated sandbox
  Advantage: Reduces cold starts; Increases flexibility; Optimal memory
  Disadvantage: Complexity; Overhead from deduplication
  Tools: AWS Lambda; Python; OpenWhisk; OpenFaaS; FunctionBench; CRIU

[34]
  Metric: Function startup latency
  Method: Runtime sharing to reduce memory consumption
  Advantage: Memory savings; Reduces cold starts
  Disadvantage: Complexity; Tensor management overhead
  Tools: OpenFaaS; TensorFlow

[35]
  Metric: Cold-start latency; Memory efficiency
  Method: REAP, a mechanism that records and prefetches a function's working set
  Advantage: Reduces cold-start latency
  Disadvantage: Overhead on the first invocation
  Tools: Python; Kubernetes

[36]
  Metric: Latency of function invocation; Data availability
  Method: INFINICACHE, caching through erasure coding
  Advantage: Cost saving
  Disadvantage: Limited to large-object caching; Relies on the limitations of serverless architecture
  Tools: AWS Lambda

[37]
  Metric: Execution time; Memory utilization
  Method: Jiffy, an elastic far-memory system
  Advantage: Improves execution time; Increases resource utilization
  Disadvantage: Complexity; Relies on a specific serverless architecture
  Tools: Amazon EC2; Python; C++; Java

[38]
  Metric: Memory reclamation speed; Tail latency
  Method: HotMem, a memory management framework with rapid reclamation of hotplugged memory
  Advantage: Faster memory reclamation
  Disadvantage: Challenges in memory management
  Tools: OpenWhisk; Azure

3. Background

In this section, we explain the concepts of serverless computing and then provide explanations about memory configuration in serverless computing.

3.1. Serverless computing

Serverless computing enables developers to concentrate on coding without the necessity of managing or provisioning servers, which is why it is termed "serverless." Serverless computing provides an efficient and scalable solution for running programs. Its ease of management and lightweight features have made it popular as an implementation model
in cloud computing [39]. Serverless computing is an abstraction of cloud computing infrastructure: a cloud computing model in which a cloud provider or a third-party vendor manages the company's servers. The company does not need to purchase, install, host, and manage the servers; instead, the cloud provider supplies all of these services.

Serverless computing, also known as Function-as-a-Service (FaaS), ensures that the code exposed to the developer consists of simple, event-driven functions. As a result, developers can focus more on writing code and delivering innovative solutions, without the hassle of creating test environments and provisioning and managing servers for web-based applications [40]. FaaS and the term "serverless" can be used interchangeably, with serverless computing being FaaS. This is because the FaaS platform automatically configures and maintains the context of the functions and connects them to cloud services without the need for developers to provide a server [41,42].

Features of serverless computing:

• Pay-per-use: In serverless computing, users pay only for the time their code uses dedicated CPU and storage. Pay-per-use pricing models reduce costs, whereas in traditional cloud services users pay for over-provisioned resources like storage and CPU time, which often sit idle [39,43].
• Speed: Teams can move from idea to market more quickly because they can focus entirely on coding, testing, and iterating without the operational costs of server management. There is no need to update fundamental infrastructure like operating systems or other software patches, allowing teams to concentrate on building the best possible features without worrying about the underlying infrastructure and resources.
• Scalability and elasticity: The cloud vendor is responsible for automatically scaling up capabilities and technologies to meet customer demands. Serverless functions should automatically scale down when there are fewer concurrent users. This elasticity transforms serverless computing models into a pay-as-value billing model [39].
• Efficiency and performance: Developers do not need to perform complex tasks like multi-threading, HTTP request handling, etc. FaaS forces developers to focus on building the application rather than configuring it.
• Programming languages: Serverless computing supports many programming languages, including JavaScript, Java, Python, Go, C#, and Swift [44]. This versatility allows developers to choose the language that best suits their project needs and expertise, increasing productivity and enabling rapid application development.

Challenges of serverless computing:

• Vendor lock-in: Vendors typically use proprietary technologies to enable their serverless services. This can create problems for users who want to migrate their workloads to another platform. When migrating to another provider, changes to the code and application architecture are inevitable [45].
• Cold start latency: Serverless services can experience a latency known as "cold start." When the service is first started, it takes some time for the service to respond. The reason for this is the initial configuration by the cloud service provider, resource allocation, and initialization of the infrastructure. This initial delay can be a concern in systems that respond to many requests per second. Methods and techniques are therefore important for mitigating the cold start problem [46-48].
• Debugging complexity: Debugging serverless functions is difficult due to their transient nature. The reason for this problem is that serverless functions typically do not maintain the state of previous calls; they are stateless by design, which can complicate application state management. Also, reports on serverless function calls should be sent to the developer. These
reports should include detailed stack traces so that developers can identify the cause of an error. Stack tracing is currently not available for serverless computing, meaning developers cannot easily identify the cause of an error [49-51].
• Architectural complexity: A serverless application may consist of multiple functions. The more functions there are, the more complex the architecture becomes, because each function must be properly linked to other functions and services. Also, managing this large number of functions and services can be difficult [52].
• Long-running tasks: Serverless computing executes functions within a limited and short execution time, while some tasks may require a long execution time. Serverless functions do not support long-running execution because they are stateless, meaning that if a function is stopped, it cannot be resumed [53].

3.2. Memory configuration

Memory configuration in serverless computing is an important aspect that influences application performance, resource efficiency, and cost management. Understanding how to optimally allocate memory for serverless functions is important for developers who want to maximize the benefits of serverless computing [40]. In serverless environments, developers deploy small units of code, known as functions, which are executed in response to specific events. Each function operates in a stateless manner and is allocated a certain amount of memory at runtime. The memory configuration affects several factors:

• Performance: The memory allocated to a function determines its execution time. The more memory, the faster the function runs, as it allows the CPU to perform better and cold start latency to decrease. But too little memory can make execution very slow and cause downtime.
• Cost efficiency: The pricing of serverless computing is pay-per-use; costs are determined by the amount of resources utilized during execution. Memory configuration can help prevent over-allocation of memory, which leads to high costs. However, a lack of memory will lead to performance issues, so the two need to be balanced [40].

3.3. Deep reinforcement learning

Neural networks were combined with reinforcement learning for the first time in 1991, when Gerald Tesauro used reinforcement learning to train a neural network to play backgammon at the master level [54]. Deep reinforcement learning (DRL) is a combination of reinforcement learning and deep neural networks. In reinforcement learning, an agent, while interacting with its environment, incrementally learns a policy that enables it to maximize long-term rewards. Combined with deep neural networks, this approach enables the learning of much more complicated policies that are suitable for high-dimensional problems. Fig. 1 shows the elements in a reinforcement learning model.

Deep reinforcement learning is used in various fields such as computer games, robotics, resource management in distributed systems, and performance optimization of cloud computing systems. Deep learning techniques can be used for memory configuration in serverless computing, predicting the amount of memory required to run a serverless application.

Serverless computing is a computing model in which the cloud service provider manages the infrastructure and server resources. A developer just needs to write their code and upload it onto the serverless platform. The advantages of serverless computing include automatic scalability, cost per resource used, and ease of management. Deep learning can be used to improve memory configuration in serverless computing: it can automatically identify memory usage patterns in applications and allocate the required memory based on them. Benefits of using deep learning for automatic memory configuration in serverless computing applications:

• Automation: With deep learning, the technology can easily identify patterns concerning memory usage in an application and automate the memory allocation needed. Consequently, this will reduce time spent on memory configuration.
• Optimization: Deep learning can learn how to optimize memory usage by managing how memory is allocated. It helps to decrease computing costs and enhances the performance of applications.
• Scalability: Memory configuration is performed in such a way that • Flexibility: Deep learning can learn from changes in memory usage
serverless applications will be able to scale seamlessly according to patterns. It can help in the improvement of an application over time.
variable workloads. Since the demand fluctuates, dynamically allo­
cating memory helps achieve good performance without incurring Deep learning to automate the configuration of memory in serverless
excessive costs. computing faces some challenges. First, there is a need for training data.
Deep learning requires enough training data in order to identify patterns
Memory configuration optimization and automated, data-driven in memory usage. Another challenge is the complexity of deep learning,
approaches to memory management in serverless computing increase which also requires a lot of time to learn. However, using deep learning
performance and scalability, and help save costs. These methods can for automating the configuration of memory in serverless computing
allow developers to minimize the challenges of manual memory offers numerous benefits. It will improve memory configuration tech­
configuration. niques and reduce computational costs.
Fig. 1. Elements in a reinforcement learning model.
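The agent-environment loop shown in Fig. 1, applied to memory selection, can be sketched in a few lines. This is a toy illustration only: the latency and cost models, the memory sizes, and the epsilon-greedy action-value update are all illustrative assumptions, not the paper's implementation.

```python
import random

# Illustrative assumption: the "action" is picking one of a few memory sizes.
MEMORY_SIZES_MB = [128, 256, 512, 1024, 2048]

def step(memory_mb: int) -> float:
    """Hypothetical environment: reward trades off latency against cost."""
    latency = 100.0 / memory_mb   # assumed: more memory -> lower latency
    cost = 0.001 * memory_mb      # assumed: more memory -> higher cost
    return -(latency + cost)      # higher reward = better trade-off

# The agent keeps a simple action-value estimate per memory size and
# updates it from the reward signal (epsilon-greedy bandit, for illustration).
q_values = {m: 0.0 for m in MEMORY_SIZES_MB}
alpha, epsilon = 0.1, 0.2
random.seed(0)

for _ in range(500):
    if random.random() < epsilon:
        action = random.choice(MEMORY_SIZES_MB)   # explore
    else:
        action = max(q_values, key=q_values.get)  # exploit
    reward = step(action)
    q_values[action] += alpha * (reward - q_values[action])

best = max(q_values, key=q_values.get)
print(best)
```

A full DRL approach replaces the tabular estimate with policy and value networks, as described in Section 4, but the observe-act-reward cycle is the same.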
Z. Shojaee Rad et al. Computer Standards & Interfaces 97 (2026) 104098
3.4. Autonomic computing

The MAPE-K model provides a framework for the management of autonomous and self-adaptive systems [55–57]. The model comprises five major components: Monitoring, Analysis, Planning, Execution, and Knowledge Management, illustrated in Fig. 2.
Monitoring is where data is gathered on an ongoing basis to measure the system's current state and functionality relative to predetermined goals. The subsequent Analysis stage examines this data to bring to light any differences between the present condition and desired outcomes, giving the information necessary for adjustment. After problem detection, the Planning stage creates schemes to amend these disparities, specifying how the system needs to modify its behavior. The Execution stage then enacts these strategies, modifying the system's behavior to attain the desired results. Knowledge Management serves as a repository for the important information shared among the various stages. Overall, the MAPE-K model offers a feedback loop that allows autonomous systems to adapt to new situations and ensure optimum performance by constantly monitoring and adjusting; this model is essential for building resilient and efficient autonomous systems. In addition, autonomous control systems greatly improve the efficiency and reliability of industrial processes by minimizing human error and ensuring consistent performance. Autonomous control systems are closed-loop controls that are preprogrammed to operate without the intervention of an operator, and they ensure predictable results in complex industrial settings. The integration of automated control systems and sophisticated monitoring techniques therefore makes operations simpler and more reliable, and such systems have become a necessity in contemporary manufacturing plants [58].

Fig. 2. Autonomic computing (MAPE-K loop) [57].

4. Proposed approach

In this section, we explain our proposed approach in more detail. First, we introduce the framework that utilizes machine learning for memory configuration in serverless computing, followed by the presentation of a formulation and an algorithm. Finally, we describe the autonomous memory configuration that uses deep learning to predict memory settings for serverless computing.

4.1. Proposed framework

In the proposed solution, Auto Opt Mem introduces an autonomous, learning-based memory configuration approach for serverless computing. The goal of Auto Opt Mem is to address the challenge of efficiently distributing serverless functions (SFs) in a serverless environment while considering a number of real-world parameters. One of the key parameters within Auto Opt Mem is the memory configuration, which determines how much of the available memory resources is allocated to a serverless function. Memory configuration has a direct impact on the performance and resource utilization of these functions. Auto Opt Mem utilizes Deep Reinforcement Learning (DRL), a branch of artificial intelligence, to learn an optimal memory configuration policy. DRL enables the system to learn from experience and make decisions based on a reward mechanism [15,16]. In this regard, Auto Opt Mem learns to allocate memory resources to SFs for maximum performance while minimizing waste of resources.
By employing DRL, Auto Opt Mem takes into account resource constraints in the serverless environment, resource requirements of functions, energy consumption, latency, and deployment costs. The system learns the optimal memory configuration policy through a training process that involves interacting with the environment and receiving feedback in the form of rewards. The automatic memory configuration learning approach in Auto Opt Mem enables the system to dynamically adapt to changing conditions and optimize memory resource allocation based on the specific needs of each SF. This approach contributes to efficient resource utilization and improved performance in serverless computing environments. This work proposes a framework called Auto Opt Mem for automatic optimal memory configuration that uses a deep learning agent to automatically optimize the memory allocated to each serverless function with respect to computational resources. The primary goal of Auto Opt Mem, utilizing DRL, is to optimize memory allocation for serverless functions; in other words, to dynamically adjust memory configurations so that they become highly performant, cost-effective, and efficient regarding resource utilization. It seeks to optimize memory for serverless functions to minimize costs and reduce latency while maintaining or improving Quality of Service (QoS).

4.1.1. Key components
This section describes the elements of the Auto Opt Mem framework that are essential to the automatic memory configuration process.
Environment
The environment is the serverless platform, such as AWS Lambda or Google Cloud Functions, where a set of functions is deployed and executed. It reflects resource usage information, performance metrics, and all other relevant status information.
Agent
The agent makes decisions about the memory size to be assigned to each function. It uses a DRL model that learns through interactions with the environment.
State
The state St at time t includes:
• Current memory allocation for each function.
• Performance metrics (e.g., execution time, latency).
• Number of requests received.
• Cost metrics.
• Specific performance requirements.
Action
The action At at time t is the choice of a memory size from a predefined set of configurations for a particular function.
Reward
The reward Rt at time t is the feedback signal that controls learning. It is defined with the aim of:
• Encouraging efficient memory usage.
• Improving performance metrics.
• Maintaining or increasing Quality of Service (QoS).
• Minimizing operational costs.
Policy
The policy is the strategy by which the agent selects actions according to the current state. The DRL model implements the policy.
Fig. 3 shows the iterative loop diagram in deep reinforcement learning algorithms.

Fig. 3. Iterative loop diagram in deep reinforcement learning algorithms.
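The key components listed above (state, action, reward) can be sketched as simple data structures. This is a minimal illustration under stated assumptions: the field names, the example function and its values, and the unit reward weights are all hypothetical, not part of the paper's code.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class FunctionState:
    memory_mb: int       # current memory allocation for the function
    latency: float       # execution-time / latency metric
    cost: float          # cost metric
    utilization: float   # fraction of allocated memory actually used
    qos: float           # quality-of-service score
    requests: int        # number of requests received

@dataclass
class SystemState:
    functions: Dict[str, FunctionState] = field(default_factory=dict)

# Action: choose the next memory size from a predefined configuration set.
MEMORY_CHOICES_MB: List[int] = [128, 256, 512, 1024, 2048]

def reward(state: SystemState, alpha: float = 1.0, beta: float = 1.0,
           gamma: float = 1.0, delta: float = 1.0) -> float:
    """Reward shaped as described in the text: penalize latency and cost,
    reward efficient utilization and higher QoS (inputs assumed normalized)."""
    total = 0.0
    for f in state.functions.values():
        total += -(alpha * f.latency + beta * f.cost
                   - gamma * f.utilization) + delta * f.qos
    return total

# Hypothetical example: one function with normalized metrics.
state = SystemState({"ml-inference": FunctionState(512, 0.2, 0.1, 0.8, 0.9, 100)})
print(reward(state))  # approximately 1.4
```

In a full implementation the policy network would map such a state vector to a distribution over MEMORY_CHOICES_MB; here the structures only make the interfaces concrete.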
4.2. Problem formulation

This section presents a mathematical model for the memory configuration problem in serverless computing. In the following, we describe the symbols used in more detail.

4.2.1. Notation

• F: Set of serverless functions
• Mi: Memory allocated to function fi ∈ F
• Li(Mi): Latency of function fi with memory Mi
• Ci(Mi): Cost of executing function fi with memory Mi
• Ui(Mi): Memory utilization for function fi with memory Mi
• Qi(Mi): Quality of Service metric for function fi with memory Mi
• Rt: Reward at time t

The list of mathematical symbols is summarized in Table 4.

Table 4
List of Mathematical Symbols.

Definition | Notation
Set of serverless functions | F
Memory allocated to function fi ∈ F | Mi
Latency of function fi with memory Mi | Li(Mi)
Cost of executing function fi with memory Mi | Ci(Mi)
Memory usage for function fi with memory Mi | Ui(Mi)
Quality of service criterion for function fi with memory Mi | Qi(Mi)
Reward at time t | Rt
State space | S
State at time t | St
Action space | A
Action at time t | At
Current memory allocation for function fi | Mi,t
Latency | Li,t
Cost | Ci,t
Quality of service | Qi,t
Weighting factor for latency | α
Weighting factor for cost | β
Weighting factor for utilization | γ
Weighting factor for quality of service | δ
Minimum memory size that can be allocated to a function | Mi,min
Maximum memory size that can be allocated to a function | Mi,max
Minimum acceptable QoS for function fi | Qi,min
Expected value | E
Policy | π
Value function | Vπ
Policy network parameters | θ
Value network parameters | ϕ
Action-value function | Qπ(s,a)

The framework and automated approach for learning-based memory configuration in serverless computing are as follows:
State space (S)
The state St at time t includes:
• Current memory allocation Mi,t for each function fi.
• Performance metrics such as latency Li,t, cost Ci,t, and Quality of Service Qi,t.
• The number of requests received.
Action space (A)
The action At at time t involves selecting the memory size Mi,t+1 for each function fi.
Reward function (R)
The reward function is an essential component of reinforcement learning, guiding the agent toward desired behaviors by associating rewards or penalties based on the outcome of actions. The reward function Rt is designed to:
• Reduce latency and cost.
• Decrease memory utilization and increase QoS.

This is expressed by Eq. (1):

Rt = Σi∈F [ −(αLi(Mi,t) + βCi(Mi,t) − γUi(Mi,t)) + δQi(Mi,t) ]    (1)

where (α, β, γ, δ) are the weighting factors for latency, cost, utilization, and QoS, respectively. These weights adjust the relative importance of each component in the reward function. For example, if reducing latency is more important than cost, α can be increased relative to β.
Lower latency is desirable; therefore, −αLi(Mi,t) penalizes higher latencies.
Lower costs are also preferable; thus, −βCi(Mi,t) penalizes higher expenses.
Memory utilization is desired to be efficient. Over-allocation of memory leads to resource waste, while under-allocation can degrade performance. +γUi(Mi,t) rewards the agent for optimal memory usage.
Higher QoS is preferred; therefore, +δQi(Mi,t) rewards better service quality.
To normalize the reward function for optimizing memory configuration in serverless computing, all components (cost, latency, utilization, and QoS) must be on a common scale, because they have different units (for example, cost in dollars, latency in milliseconds). We use min-max normalization for each component. The min-max normalization formula is defined by Eq. (2):

x' = (x − xmin) / (xmax − xmin)    (2)

Where:
• x is the original value,
• xmin is the minimum value for that metric,
• xmax is the maximum value for that metric.

After normalization, the metrics can be combined into the reward function without unit conflict, and the final reward formula is expressed by Eq. (3):

Rt = Σi∈F [ −α (Li(Mi,t) − Lmin)/(Lmax − Lmin) − β (Ci(Mi,t) − Cmin)/(Cmax − Cmin) + γ (Ui(Mi,t) − Umin)/(Umax − Umin) + δ (Qi(Mi,t) − Qmin)/(Qmax − Qmin) ]    (3)

Normalization maps each component into the range [0, 1], making the components comparable even though they come from different units, so they can be combined meaningfully during the optimization process. This method increases the robustness of the model and enhances convergence in reinforcement learning algorithms.

4.2.2. Optimization problem
First, it is necessary to formulate the problem mathematically in terms of optimization and reinforcement learning for the Auto Opt Mem algorithm and its solution. The optimization problem is expressed by Eq. (4):

minM Σt=0..T Rt    (4)

with the memory constraints expressed by Eq. (5):

Mi,min ≤ Mi,t ≤ Mi,max   for all i ∈ F    (5)

and the Quality of Service constraints expressed by Eq. (6):

Qi(Mi,t) ≥ Qi,min   for all i ∈ F    (6)

Mi,min and Mi,max are the minimum and maximum memory sizes that can be allocated to the function fi. Qi,min is the minimum acceptable QoS for the function fi; this ensures that the QoS does not fall below a certain threshold.

4.2.3. Reinforcement learning formulas
This section explains the mathematical formulas used in the reinforcement learning component of the Auto Opt Mem framework.
Policy (π)
The policy π(a|s) is the probability distribution over actions given the current state. It is the strategy that the decision-maker follows for its next action in response to the current state.
Value function (Vπ)
The value function refers to the expected long-term discounted reward, which contrasts with the short-term reward (Rt) as it focuses on the long term. It represents the expected return, in the long term, resulting from the current state s under policy π. The value function is an important concept in reinforcement learning: it represents the expected return (cumulative reward) starting from state s and following policy π, and is defined by Eq. (7):

Vπ(s) = E[ Σt=0..∞ γ^t Rt | S0 = s, π ]    (7)

Where γ is the discount factor.
• Expected value (E): Due to the random nature of the environment, it is the average or expected return over all possible future states.
• Sum of rewards: Σt=0..∞ γ^t Rt represents the sum of rewards over time. Rewards are discounted by a factor of γ^t to prioritize immediate rewards over distant future rewards.
• Discount factor (γ): The discount factor γ (where 0 < γ < 1) determines the present value of future rewards. A higher γ makes future rewards more significant, while a lower γ emphasizes immediate rewards.
• Policy (π): The policy π is the decision rule that gives the probability of executing action a from a state s.
The value function is important since it is used to measure how good a given state is under a given policy. It is also a basis for policy comparison and for determining optimal actions that maximize the long-term reward.
Bellman equation
The Bellman equation provides a recursive decomposition for the value function, making it a powerful tool for solving reinforcement learning problems. The Bellman equation for the value function is defined by Eq. (8):

Vπ(s) = Ea∼π[ Rt + γVπ(St+1) | St = s, At = a ]    (8)

Reasons for using the Bellman equation:
• Recursive nature: The Bellman equation breaks down the value of a state into the immediate reward Rt plus the discounted value of the next state γVπ(St+1). The recursion makes computation efficient and forms the foundation for dynamic programming techniques.
• Policy evaluation: By repeatedly updating the value function using the recursive relationship, it aids in evaluating the expected return of a policy π.
• Optimality principle: For finding the optimal policy, the Bellman optimality equation expresses the relationship between the value of a state and the values of subsequent states. This is used in algorithms like value iteration and Q-learning for finding the optimal value function.
• Simplification of complexity: The use of a recursive approach allows the Bellman equation to simplify the calculation of the value
function for all states, which otherwise would be computationally infeasible in large state spaces.
Policy gradient
The policy gradient method is used to optimize the policy π, and is defined by Eq. (9):

∇θ J(πθ) = Es∼dπ, a∼πθ[ ∇θ log πθ(a|s) Qπ(s, a) ]    (9)

where θ are the parameters of the policy and Qπ(s,a) is the action-value function.
The action-value function represents the expected return of taking action a in state s and then following policy π, providing a basis for policy improvement.
The Bellman equation and the value function are the basis for studying and quantifying the long-term impact of these decisions, and they guide the policy toward optimal behavior.

4.3. Proposed algorithm

The proposed Auto Opt Mem algorithm uses deep reinforcement learning to learn how to assign serverless functions to compute resources efficiently. Incorporating the MAPE loop, the Auto Opt Mem algorithm continuously monitors, analyzes, plans, and executes actions to optimize memory allocation in an autonomous and adaptive manner. The process of the algorithm is as follows:

4.3.1. Initialization
This step is the first stage of the deep reinforcement learning (DRL) process for memory configuration. First, two main networks are prepared: the Policy Network, which is responsible for choosing the action (e.g., the amount of memory to allocate) in each state. This network initially starts with random parameters, because it has no knowledge at the beginning, and it gradually learns to make optimal decisions. The Value Network estimates the long-term value of a state based on the sum of future rewards; this network also starts with random parameters. Then, the initial state of the environment (S0) is defined, which includes the memory configuration of each function, performance indicators (latency, cost, QoS, and resource utilization), and the rate of incoming requests. This step is the basis of training and provides the foundation for the agent's interaction with the environment.
This initialization is very important, as it sets the starting point for training the networks to learn optimal memory allocation for serverless functions, with the aim of minimizing cost and latency and maintaining high QoS. This is shown in Algorithm 1.

Algorithm 1
Pseudo code for the initialization phase.
1: Input: Set of serverless functions F
2: Output: Initialized policy network πθ, value network Vϕ, and initial state S0
3: Initialize policy network πθ with random parameters θ
4: Initialize value network Vϕ with random parameters ϕ
5: Define initial system state S0 including:
6: - Memory allocation for each fi ∈ F
7: - Performance metrics: Latency Li(Mi), Cost Ci(Mi), Utilization Ui(Mi), QoS Qi(Mi)
8: - Incoming request rate
9: Return (πθ, Vϕ, S0)

4.3.2. MAPE loop
In this section, we describe an autonomous memory configuration process that includes four phases: monitoring, analysis, planning, and execution.

4.3.2.1. Monitor. In the monitoring phase, the system observes the current state of the environment at all times, so that it is always aware of the state of the environment. This includes monitoring the memory allocated to each serverless function, obtaining performance data such as latency, cost, utilization, and quality of service (QoS), and monitoring the number of incoming requests for each function. The monitoring phase is important because it records real-time information that indicates system performance and resource utilization. In short:
• Continuously observe the current state St, including memory allocation, performance metrics (latency Li,t, cost Ci,t, utilization Ui,t, QoS Qi,t), and the incoming request rate.
• Collect data from serverless functions and the environment.

4.3.2.2. Analyze. In the analysis phase, the system uses the performance metrics collected during the monitoring phase. It examines the current performance of the serverless functions based on the collected data and calculates a reward value that controls the learning. The reward function is usually based on latency, cost, utilization, and QoS, thus exposing any inefficiencies or issues that need to be addressed in subsequent phases. In short:
• Evaluate performance metrics Li(Mi), Ci(Mi), Ui(Mi), Qi(Mi).
• Calculate reward Rt based on the current state, which is expressed by Eq. (10):

Rt = Σi∈F [ −α (Li(Mi,t) − Lmin)/(Lmax − Lmin) − β (Ci(Mi,t) − Cmin)/(Cmax − Cmin) + γ (Ui(Mi,t) − Umin)/(Umax − Umin) + δ (Qi(Mi,t) − Qmin)/(Qmax − Qmin) ]    (10)

4.3.2.3. Plan. In the planning phase, the system decides the next actions based on the analysis observations. In this stage, the best memory allocation decisions are selected based on the policies learned by the deep reinforcement learning model, and the policy network parameters are updated for enhanced future decision-making. This planning helps the system adjust its memory allocation policies effectively based on the current situation and performance analysis. In short:
• Use the policy network πθ to determine the optimal action At (memory allocation for the next time step).
• Update the policy and value networks using reinforcement learning techniques. This typically involves:
○ Calculating the policy gradient using the policy gradient method.
○ Updating the policy network parameters θ.
○ Using the Bellman equation to update the value network parameters ϕ.

4.3.2.4. Execution. The execution phase is responsible for carrying out the actions that have been planned. It allocates memory to every function according to the decisions of the planning phase and transitions to the next state, ready to trigger the MAPE cycle once more. Its role is to realize the changes from planning and analysis, optimize resource utilization, and improve the system as a whole. In summary:
• Apply the selected action to adjust the memory allocation for each function.
• Go to the next state St+1 and repeat the loop.

The MAPE loop enables a dynamic and automated way of managing memory in serverless computing systems, where the system can learn and adapt to new situations on a continuous basis, eventually resulting in enhanced performance and resource efficiency, as shown in Algorithm 2.

4.3.3. Training loop
The training loop is an important part of the Auto Opt Mem algorithm, which uses deep reinforcement learning to improve how memory
is allocated for serverless functions. The process starts by resetting the environment to its initial state, called S0. For each time step t, the algorithm goes through the MAPE loop. It begins by checking the current state St, gathering data about memory usage, performance metrics, and other relevant information. After collecting this data, it analyzes it to see how well the system is performing and calculates a reward that helps guide future learning. Next, the algorithm plans the next action by choosing the best way to allocate memory using its policy network; this decision is based on the insights gained from the analysis. Once it decides on an action, the system carries it out, adjusting the memory for each function as needed. Finally, the algorithm moves to the next state St+1 and prepares to start the loop again. This ongoing process allows the system to learn and adapt continuously, refining its memory allocation strategies based on real-time feedback. Each cycle helps improve the overall performance and efficiency of memory management in the serverless environment. In summary:

1. Environment reset: Reset the environment to its initial state S0.
2. MAPE execution: For each time step t:
• Monitor: Observe the current state St.
• Analyze: Evaluate performance and calculate the reward.
• Plan: Select an action using the policy network.
• Execute: Execute the action and move to the next state St+1.

Algorithm 3 shows the execution phase of the MAPE loop during the training process.

5. Performance evaluation

In this section, we present the performance evaluation of the novel automatic deep learning-based approach (Auto Opt Mem) for memory configuration in serverless computing. We describe the experimental setup, the performance metrics, and the results of the experiments.

5.1. Experimental setup

We carried out the experimental analysis in this study on a Windows 11 64-bit computer with an Intel Core i7 processor. The evaluation uses a serverless simulation environment, where memory sizes between 128 MB and 2048 MB are modeled to study their impact on latency, cost, and quality of service (QoS). A virtual CPU (vCPU) model with burst behavior and dynamic workload scaling is incorporated into the simulation. Python was used to implement the deep reinforcement learning algorithms. The Proximal Policy Optimization (PPO) algorithm was used to train the deep reinforcement learning agent, as it is known to be stable and efficient in policy optimization procedures. PPO was chosen because it is widely suitable for continuous control and policy optimization problems in environments with large dynamic state spaces, such as serverless environments, and it offers higher training stability than algorithms such as REINFORCE or Vanilla Policy Gradient. It is able to control the trade-off between exploration and exploitation in dynamic environments where the workload changes randomly, and it is suitable and scalable in environments with high-dimensional state spaces where multiple parameters such as latency, cost, utilization, and quality of service (QoS) need to be optimized simultaneously. The workloads used in our evaluation consisted of four modeled serverless scenarios: ML inference, API aggregation, data preprocessing (ETL), and video processing. These workloads are designed to emulate the performance behavior of serverless applications while being executed in a fully simulated environment. Furthermore, all experiments are implemented in the Python language, and the source code of the simulation can be downloaded from the GitHub repository.¹ Although AWS Lambda and CloudWatch were used conceptually to structure the resource and metric model, all experiments were executed in a simulated environment, as shown in Table 5.
The experiments involved tuning a set of hyperparameters. The parameters were:
Learning rate: Different learning rates, such as 0.01, 0.001, and 0.0001, were tried during training.
Discount factor (γ): A discount factor of 0.99 was chosen to give higher importance to long-term rewards in the reinforcement learning.
Batch size: A batch size of 32 was used in training the deep learning models, a trade-off between training time and model accuracy.
Number of episodes: Training was carried out over 100 episodes, during which the agent learns through experience interacting with the environment and tunes its memory allocation policies.
Both the policy and value networks are implemented as multilayer perceptrons (MLPs). The input layer receives the state vector, which includes memory allocation, latency, cost, utilization, QoS, and request rate. Each network has two fully connected hidden layers with 128 and 64 neurons, respectively, and ReLU activation functions. The policy network ends with a softmax output layer producing a probability distribution over possible memory allocations, while the value network has a single linear output estimating the state value. The generated experience data was split into 70 % for training and 30 % for evaluation, which is standard in DRL-based optimization studies.

¹ https://github.com/zahrashj-rad/Auto-Opt-Mem

5.2. Performance metrics

To evaluate the effectiveness of the proposed approach, we utilize several performance metrics, including:
Latency: The time taken for a function to execute, measured in milliseconds. Lower latency indicates better performance. It is expressed by Eq. (11):

L = (Tend − Tstart) / N    (11)

where Tstart and Tend are the start and end times of execution, and N is the number of function invocations.
Cost: The total cost incurred during function execution, measured in US dollars. Our goal is to minimize this cost. It is expressed by Eq. (12):

C = Σi=1..N (Mi × Pmem + Ti × Pexec)    (12)

where Mi is the allocated memory, Pmem is the price per MB, Ti is the execution time, and Pexec is the execution price per second.
Quality of Service (QoS): A composite score reflecting the reliability and user satisfaction of the service, based on latency and availability, expressed by Eq. (13):

QoS = 1 / (L + δ(1 − A))    (13)

where A is the availability factor (the percentage of successful executions), which lets the quality of service take into account the impact of availability on overall system performance, and δ is a weighting parameter.
Utilization: The efficiency of memory usage during function execution, aiming for optimal allocation without wastage. It is expressed by Eq. (14):

U = (Mused / Mallocated) × 100    (14)

where Mused is the actual memory usage and Mallocated is the assigned memory. A higher value indicates optimal management of memory resources.
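The four metrics above follow directly from Eqs. (11)-(14). A minimal sketch is given below; the default prices p_mem and p_exec in `cost` are hypothetical placeholders, not the prices used in the paper's experiments.

```python
def latency(t_start: float, t_end: float, n_invocations: int) -> float:
    """Eq. (11): average execution time per invocation."""
    return (t_end - t_start) / n_invocations

def cost(mem_mb, exec_s, p_mem: float = 2e-7, p_exec: float = 1e-5) -> float:
    """Eq. (12): total cost over all executions (per-MB and per-second prices)."""
    return sum(m * p_mem + t * p_exec for m, t in zip(mem_mb, exec_s))

def qos(lat: float, availability: float, delta: float = 1.0) -> float:
    """Eq. (13): composite score from latency and availability (0..1)."""
    return 1.0 / (lat + delta * (1.0 - availability))

def utilization(mem_used: float, mem_allocated: float) -> float:
    """Eq. (14): percentage of allocated memory actually used."""
    return mem_used / mem_allocated * 100.0

print(latency(0.0, 10.0, 100))   # 0.1
print(utilization(256, 512))     # 50.0
```

Note that qos grows without bound as both latency and unavailability approach zero, which is consistent with Eq. (13): perfect availability leaves latency as the only denominator term.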
Algorithm 2
Pseudo code for the MAPE loop phase.
1: Input: Current state St, networks (πθ, Vϕ)
2: Output: Updated state St+1
3: Monitor:
4: Collect metrics (Latency Li,t, Cost Ci,t, Utilization Ui,t, QoS Qi,t, Requests)
5: Analyze:
6: Compute reward Rt using the reward function
7: Rt = −(αLi(Mi,t) + βCi(Mi,t) − γUi(Mi,t)) + δQi(Mi,t)
8: Normalize all components (cost, latency, utilization, QoS)
9: using:
10: X' = (X − Xmin) / (Xmax − Xmin)
11: Plan:
12: Select action At = πθ(St)
13: Update policy network πθ using the policy gradient
14: ∇θ J(πθ) = Es∼dπ, a∼πθ[ ∇θ log πθ(a|s) Qπθ(s,a) ]
15: Update value network Vϕ using the Bellman equation
16: Vπ(St) = Ea∼π[ Rt + γVπ(St+1) | St = s, At = a ]
17: Execute:
18: Apply action At (update memory allocation Mi,t+1)
19: Transition to next state St+1
20: Return St+1

the model's weights. We tested different learning rates to examine their impact on Auto Opt Mem's efficiency.
In this experiment, different learning rates were applied to the problem of prediction and memory regulation in serverless environments, and a reinforcement learning agent based on the policy gradient was implemented. The goal was to show the impact of the learning rate on memory allocation policy learning and, ultimately, on the performance metrics latency, cost, QoS, and utilization. Three learning rate values were tested: 0.01, 0.001, and 0.0001; each experiment was performed over 100 episodes. The environment model and the relationships between memory and the metrics were implemented with a noisy function to represent the overall system behavior and the impact of memory selection on the metrics.
The learning rate can affect the performance metrics as follows:
Latency: A very high learning rate may destabilize training, preventing the model from converging to an optimal policy. This instability leads to fluctuations in memory allocation and, consequently, higher execution latency. Conversely, a moderate learning rate supports stable convergence, resulting in lower latency.
Cost: If memory allocation decisions are unstable due to an excessively high learning rate, resource usage can increase, raising the execution cost. However, in some cases, a higher learning rate can accelerate convergence to an efficient allocation, thus reducing costs. The impact on cost is therefore dual, depending on whether training stabilizes.
Quality of Service (QoS): High learning rates may cause instability in memory allocation policies, leading to inconsistent performance and

Algorithm 3
Pseudo code for the MAPE execution phase.
1: Input: Environment, networks (πθ, Vϕ)
2: Output: Optimized memory allocation policy
3: Initialize environment and obtain initial state S0
4: For each episode do
5: For each time step t do
5: For each time step t do
6: Run MAPE Loop with input St degraded QoS. In contrast, a well-tuned moderate learning rate achieves
7: Observe reward Rt and next state St+1 more stable optimization and improved QoS.
8: Store tuple (St, At, Rt, St+1) Utilization: Rapid but unstable adjustments caused by a high
9: Update πθ and Vϕ based on (St, At, Rt, St+1) learning rate can lead to inefficient memory allocations, resulting in
10: Policy network update
11: ∇θ J(πθ) = E₍s-dπ, a-πθ₎ [ ∇θ log πθ(a|s) Qπθ(s,a) ]
either under-utilization or over-utilization of resources. A balanced
12: Value network update learning rate is more likely to achieve efficient utilization. Fig. 4 shows
13: Vπ(S) = E a-π [ Rₜ + γ Vπ (S + 1) | Sₜ = s, Aₜ = a ] the learning rates and results.
14: Set St ← St+1 The cost and quality of service results obtained from training the
15: End For
agent with different learning rates showed:
16: End For
17: Return optimized policy πθ The intermediate learning rate LR = 0.001 shows the lowest latency
and highest QoS among the three values. In contrast, the cost increases
significantly and the utilization decreases. In fact, the agent with LR =
0.001 learns faster and more significantly policies that favor higher
Table 5
Tools and technologies.
memory (or policies that make allocations that reduce latency) this leads
to improved QoS but increases the cost per execution unit. The inter­
Component Description
mediate rate allows the agent to accept weight changes strongly enough
Cloud Provider AWS Lambda (conceptual model), implemented as a to reach regions with lower latency (but may be cost-ineffective).
simulated environment in code. The larger learning rate LR = 0.01 shows very low cost and very high
Programming Language Python, for implementing the deep reinforcement
learning algorithms.
utilization, but latency and QoS remain at moderate levels. In fact, high
Deep Learning Framework TensorFlow/Keras, for building and training the deep learning rates usually make large updates; In this simulation, the agent
learning models. has arrived at a policy that keeps the cost low (e.g., choosing low or
Reinforcement Learning Proximal Policy Optimization (PPO) average memories) while maintaining high utilization. This could mean
Algorithm
learning a cost-saving policy; however, this policy may cause fluctua­
Monitoring Tools Simulated monitoring module (MAPE-K), no real
CloudWatch data used. tions and not reach the optimal latency. It is also possible for the agent to
Data Management Pandas, for data manipulation and analysis of the get stuck in a local boundary (with low cost) under noisy gradients.
performance metrics. A smaller learning rate of LR = 0.0001 had intermediate results
Development Environment Jupyter Notebook / Google Colab for interactive (latency and QoS close to LR=0.01 but average cost). In fact, a too small
development and experimentation.
rate leads to slow and stable learning; the agent may not have fully
converged yet and not have seen significant improvement by the end.
5.3. Experimental results Table 6 shows impact of learning rate effects on performance
metrics.
We evaluated our proposed approach with baseline methods,
including machine learning-based approaches, the impact of learning 5.3.2. Second scenario: reward function formula based on MAPE loop
rate and reward function. To further analyze the impact of Auto Opt Mem, we conducted eight
experiments and calculated the reward function. As explained in the
5.3.1. First scenario: impact of learning rate on optimization Section 4, the reward function is calculated from the following formula.
One of the important hyperparameters in deep reinforcement
learning is the learning rate (LR), which controls the extent of updates to
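Algorithms 2 and 3 and the learning-rate sweep above can be sketched end to end. The block below is a deliberately simplified, self-contained stand-in: a stateless softmax policy trained with a REINFORCE-style update in place of the paper's PPO networks, over a toy noisy environment model. The memory sizes, metric model, normalization bounds, and the step size (the learning rate discussed above) are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Candidate memory sizes (MB); illustrative, not from the paper.
MEMORY_SIZES = np.array([128.0, 256.0, 512.0, 1024.0, 2048.0])
LEARNING_RATE = 0.05  # step size; the paper sweeps 0.01 / 0.001 / 0.0001

rng = np.random.default_rng(0)
theta = np.zeros(len(MEMORY_SIZES))  # logits of a stateless softmax policy


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def monitor(mem_mb):
    """Monitor phase: noisy metrics for a memory size (toy environment model)."""
    latency = 100.0 * np.sqrt(256.0 / mem_mb) + rng.normal(0.0, 2.0)  # ms
    cost = (mem_mb / 1024.0) * 1.67e-5 * latency                      # GB-second style
    utilization = min(100.0, 100.0 * 180.0 / mem_mb)                  # % of allocation
    qos = 100.0 - latency / 3.0                                       # toy QoS score
    return latency, cost, utilization, qos


def analyze(latency, cost, utilization, qos):
    """Analyze phase: min-max-normalized multi-objective reward, unit weights."""
    l_hat = (latency - 50.0) / (150.0 - 50.0)
    c_hat = cost / 0.01
    u_hat = utilization / 100.0
    q_hat = qos / 100.0
    return -(l_hat + c_hat - u_hat) + q_hat


for episode in range(200):
    probs = softmax(theta)
    action = rng.choice(len(MEMORY_SIZES), p=probs)   # Plan: sample a memory size
    reward = analyze(*monitor(MEMORY_SIZES[action]))  # Execute, then Monitor/Analyze
    grad = -probs.copy()
    grad[action] += 1.0                               # d/d(theta) of log pi(action)
    theta += LEARNING_RATE * reward * grad            # REINFORCE policy update
```

Each iteration walks the MAPE phases: Plan samples a memory size from the policy, Execute applies it, Monitor draws noisy metrics, and Analyze folds them into the normalized multi-objective reward that scales the policy-gradient step.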
5.3.2. Second scenario: reward function formula based on MAPE loop

To further analyze the impact of Auto Opt Mem, we conducted eight experiments and calculated the reward function. As explained in Section 4, the reward function is calculated from the following formula:

Rt = Σ_{i∈f} [ −( α·(Li(Mi,t) − Lmin)/(Lmax − Lmin) + β·(Ci(Mi,t) − Cmin)/(Cmax − Cmin) − γ·(Ui(Mi,t) − Umin)/(Umax − Umin) ) + δ·(Qi(Mi,t) − Qmin)/(Qmax − Qmin) ]

Where α, β, γ, and δ are the weights for each variable. The negative sign at the beginning of the formula shows that we include the negative impact of cost and latency in the reward function. Quality of service (Qi) is considered positive, and its positive impact is included in the reward function. By summing over the metrics, we consider the combined influence of all functions. To calculate the reward function, the minimum and maximum values are given below:

Lmin = 80, Lmax = 100
Cmin = 0.15, Cmax = 0.50
Umin = 60, Umax = 80
Qmin = 40, Qmax = 70

The reward function values for the different experimental data are shown in Table 7 (latency in ms, cost in USD, QoS in %, utilization in %, reward as a normalized score).

We calculate the reward function for each experiment.

Experiment 1:

R1 = −( (90 − 80)/(100 − 80) + (0.17 − 0.15)/(0.50 − 0.15) − (75 − 60)/(80 − 60) ) + (66 − 40)/(70 − 40)
R1 = −(0.5 + 0.1333 − 0.75) + 0.8667 = −(−0.1167) = 0.1167

The reward function for the remaining experiments is calculated similarly.

These findings strengthen the case that Auto Opt Mem can optimize execution time while reducing costs, giving it better performance. The reward function results are analyzed as follows:

• Increasing QoS and utilization increases the reward, because the system achieves better efficiency.
• Reducing latency and cost has a positive effect on the reward, which indicates more optimal performance.
• The highest reward value is observed in Experiment 8 (0.91), which indicates the optimal balance between QoS, utilization, latency, and cost.
• The lowest reward value is recorded in Experiment 6 (0.03), which is due to the increase in latency and the decrease in system efficiency (utilization).

Fig. 4. Impact of learning rate on performance metrics.
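The reward calculation can be reproduced directly in code. Note that the weight values α, β, γ, and δ are not reported in the paper; the sketch below assumes unit weights, under which Experiment 1 evaluates to roughly 1.06 rather than the 0.11 reported in Table 7, so it illustrates the formula rather than reproducing the paper's exact settings.

```python
def normalize(x, lo, hi):
    """Min-max normalization to the [lo, hi] bounds given in the paper."""
    return (x - lo) / (hi - lo)

# Bounds from the paper: Lmin/Lmax, Cmin/Cmax, Umin/Umax, Qmin/Qmax.
BOUNDS = {"latency": (80, 100), "cost": (0.15, 0.50),
          "util": (60, 80), "qos": (40, 70)}

def reward(latency, cost, util, qos, alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    """-(alpha*L + beta*C - gamma*U) + delta*Q over normalized metrics."""
    l = normalize(latency, *BOUNDS["latency"])
    c = normalize(cost, *BOUNDS["cost"])
    u = normalize(util, *BOUNDS["util"])
    q = normalize(qos, *BOUNDS["qos"])
    return -(alpha * l + beta * c - gamma * u) + delta * q

# Experiment 1 from Table 7: 90 ms, $0.17, 75 % utilization, 66 % QoS.
r1 = reward(latency=90, cost=0.17, util=75, qos=66)
```

As expected from the analysis, the better-balanced Experiment 8 inputs (85 ms, $0.14, 82 %, 72 %) yield a higher reward than Experiment 1 under the same assumed weights.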
Table 6
Comparison of learning rate effects on performance metrics.

Learning Rate (LR) | Latency | Cost | QoS | Utilization
0.01 (high) | Relatively high, unstable, converges to a moderate level | Lowest cost | Moderate QoS | Highest utilization
0.001 (medium) | Lowest latency (best) | Highest cost | Highest QoS | Reduced utilization
0.0001 (low) | Moderate, slow convergence | Moderate cost | Moderate QoS | Moderate utilization

Table 7
Reward function values in different experiments.

Experiment | Latency (ms) | Cost ($) | QoS ( %) | Utilization ( %) | Reward
1 | 90 | 0.17 | 66 | 75 | 0.11
2 | 88 | 0.16 | 63 | 78 | 0.43
3 | 92 | 0.18 | 67 | 74 | 0.8
4 | 87 | 0.15 | 70 | 80 | 0.65
5 | 89 | 0.17 | 64 | 76 | 0.21
6 | 91 | 0.16 | 68 | 73 | 0.03
7 | 86 | 0.15 | 71 | 79 | 0.65
8 | 85 | 0.14 | 72 | 82 | 0.91

Fig. 5 shows the reward function graph (X-axis: experiment ID, 1-8; Y-axis: normalized reward value). Fig. 6 shows the comparison of the reward function with the other metrics across the different experiments.

• Comparison of latency and reward: Fig. 6a shows that decreasing latency generally leads to increasing reward. The reward value is higher at lower latencies (such as 84 and 85 milliseconds) but decreases at higher latencies.
• Comparison of cost and reward: Fig. 6b shows that while costs may be decreasing, the rewards are increasing, indicating a potentially favorable outcome for the experiments.
• Comparison of quality of service and reward: Fig. 6c shows that increasing QoS usually increases reward. Quality of service in the range of 70-72 % has the highest reward value.
• Comparison of utilization and reward: Fig. 6d shows that higher utilization usually leads to increased reward. The reward value increases sharply for values of 80 % and above.

These graphs show that the optimized system tends to reduce latency, reduce cost, increase quality of service, and improve efficiency (utilization) to obtain the maximum reward. The suggested reward function, which incorporates delay, cost, utilization, and QoS, can effectively balance the system's various goals. Experiment 8, with the highest reward of 0.91, depicts the optimal balance between all objectives. Experiment 6, with the lowest reward of 0.03, is worse due to its high delay and low utilization. The comparison graphs in Fig. 6 clearly show that improving any of the criteria, such as reducing delay or cost, or increasing QoS or utilization, leads to an increase in reward. This means that the reward function design is suitable and can be used as a criterion for multi-objective optimization.

5.3.3. Third scenario: comparison with previous studies

To validate our results, we compared Auto Opt Mem with two papers, [17] and [18], that tackled memory optimization in serverless computing. Compared to the works of Eismann et al. [17] and Jindal et al. [18], Auto Opt Mem achieves a more substantial reduction in execution latency, a greater cost reduction, and a significant improvement in QoS. Unlike previous studies that primarily relied on static function profiling or statistical estimations, Auto Opt Mem continuously learns and adapts to varying workloads, and thus is more adaptive and scalable. Eismann et al. [17] developed a model for resource prediction based on monitoring a single memory size, achieving performance gains but with limited adaptability to dynamic workloads. Jindal et al. [18] introduced a statistical and deep learning approach to estimate function capacity, improving resource efficiency but lacking real-time optimization capabilities. Auto Opt Mem integrates reinforcement learning to dynamically adjust memory allocations, ensuring optimal performance across diverse execution environments without requiring manual intervention.

This section assesses the proposed Auto Opt Mem in a realistic serverless simulation environment. Four workloads were modeled as representatives, including ML inference, API gateway, data processing, and video processing, each independently defined by its memory range, baseline latency, and cost functions. A neural network with two hidden layers (32 and 16 neurons) was used to train the PPO agent for 80 episodes. The reward function incorporates normalized latency, cost, and QoS terms that drive the policy to achieve a balanced optimization. An autonomic MAPE-K (Monitor, Analyze, Plan, Execute, Knowledge) loop was implemented at runtime to make the system self-adaptive. MAPE continuously monitors recent performance, analyzes QoS/cost trends, plans corrective actions such as tuning the memory, and executes them through the adjustment of PPO policy parameters. The results are summarized in Table 8.

Auto Opt Mem demonstrates superior improvements across all metrics compared to existing research, establishing its effectiveness. Fig. 7 shows the comparison with previous studies.

Table 8 reports the average performance achieved by Auto Opt Mem across all benchmark functions in terms of latency reduction, cost savings, and QoS improvement. These values are directly compared with the results of Sizeless [17] and FnCapacitor [18]. The results demonstrate that, on average, the proposed Auto Opt Mem achieves 25-30 % less latency, 15-18 % less cost, and 10-12 % more QoS than the Sizeless [17] baseline, while outperforming FnCapacitor [18] in all key metrics. This shows that the PPO-based MAPE-K autonomic loop can dynamically adjust to workload variability and provide optimized resource allocation in serverless environments.

The results of 8 experiments conducted with different metrics are presented in Table 9. Synthetic yet reproducible data were used, and the dataset is openly provided for replication.

Table 10 summarizes the statistical comparison of Auto Opt Mem with the baseline methods. For each metric, the mean, standard deviation, and 95 % confidence interval were calculated. For instance, Auto Opt Mem has a latency of 267.92 ms ± 1.65 with a 95 % CI of (266.74, 269.10), outperforming both Sizeless [17] and FnCapacitor [18].

For statistical significance, we use an independent two-sample t-test. Our results indicate that the improvements of Auto Opt Mem over Sizeless [17] are statistically significant for all metrics (p < 0.02). For FnCapacitor [18], the cost differences are significant with p = 0.035. The statistical analysis confirms that Auto Opt Mem provides consistently better performance, with several improvements being significant at the 95 % confidence level.

The average latency, average cost, and average quality of service of the different methods are shown in Fig. 8.

In Fig. 8a, Auto Opt Mem (green line) has the lowest latency in all tests, indicating better optimization in memory allocation and faster execution of serverless functions. In Fig. 8b, Auto Opt Mem minimizes the cost of executing functions in all experiments; the method of Jindal et al. [18] reduces the cost compared to that of Eismann et al. [17] but is still higher than Auto Opt Mem. In Fig. 8c, Auto Opt Mem (green line) has the highest quality of service (QoS), indicating increased reliability and optimal performance under different workload conditions. The method of Jindal et al. [18] provides better QoS than that of Eismann et al. [17] but still falls short of Auto Opt Mem's.
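The independent two-sample t-test behind Table 10 can be checked against the raw Table 9 latency samples. The sketch below computes Welch's t statistic (the unequal-variance form of the test) using only the standard library; `scipy.stats.ttest_ind(..., equal_var=False)` would additionally return the p-value.

```python
from math import sqrt
from statistics import mean, stdev

# Latency samples (ms) from Table 9.
sizeless = [362.85, 370.14, 355.92, 380.18, 364.77, 359.60, 372.31, 368.54]
auto_opt_mem = [267.91, 270.34, 265.10, 268.05, 266.83, 269.14, 267.54, 266.41]

def welch_t(a, b):
    """Welch's two-sample t statistic (no equal-variance assumption)."""
    return (mean(a) - mean(b)) / sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))

t_latency = welch_t(sizeless, auto_opt_mem)  # large |t| -> significant difference
```

With a roughly 99 ms gap between the means and small within-group spread, the statistic is very large, consistent with the p = 0.002 reported for latency in Table 10.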
Fig. 5. Reward function values for eight experimental runs.

Table 11 further illustrates the differences between our approach and previous methods.

Compared to static allocation strategies, Auto Opt Mem avoids a great deal of resource wastage by allocating memory according to actual function requirements rather than over-provisioning. Such dynamic allocation leads to substantial cost savings, especially with varying workloads or mixed kinds of functions. Moreover, the ability of Auto Opt Mem to minimize latency leads to improved application performance and user experience. Through efficient memory management and reduced cold start time, Auto Opt Mem ensures that functions execute both quickly and reliably, even in high-demand situations. In contrast to heuristic methods, which typically operate on oversimplified assumptions and struggle with dynamic states, Auto Opt Mem employs deep reinforcement learning to derive decisions from the actual current state. Such responsiveness is important in numerous serverless computing applications. While machine learning-based methods are somewhat flexible, Auto Opt Mem is better at balancing conflicting objectives, e.g., cost minimization and optimal performance. By means of a reward function that considers several parameters, Auto Opt Mem provides an overall optimization of the system.

6. Discussion

In this section, we discuss the AWS Lambda platform and potential application scenarios, and then provide a comparative analysis of memory configuration in serverless computing.

6.1. Use of AWS

The experiments were conducted in a simulated environment; the workload behavior, latency patterns, and cost models were derived from documented AWS Lambda configuration rules. This ensures that the characteristics of a real AWS serverless environment are captured while still allowing full reproducibility.

AWS Lambda serves as the conceptual reference platform due to its very high market share and representative resource allocation model with regard to, for example, memory-to-vCPU coupling and pricing scheme, which closely aligns with Azure Functions and Google Cloud Functions. Auto Opt Mem is provider-agnostic since platform-specific constraints, such as memory ranges and CPU scaling rules, are embedded in the state representation. Thus, even though execution was simulated, the framework is transferable to real AWS Lambda deployments and other cloud providers as well.

6.2. Potential application scenarios

Due to resource and space limitations, we used controlled micro-benchmarks. Still, Auto Opt Mem is not restricted to this setup and can also be used in real applications. For example, it helps with machine learning inference tasks, such as running image or text classification models, where reducing cost and latency is important. It is also useful for video transcoding, since converting formats with tools such as ffmpeg usually takes a lot of CPU and memory. Another case is data preprocessing, where large datasets need to be read, compressed, or filtered. Auto Opt Mem also works well in API aggregation, when several external services are called at the same time and the process is mostly I/O-bound.

These examples illustrate that Auto Opt Mem is workload-agnostic and can be applied to real-world scenarios. A full-scale practical evaluation is considered for future research. Table 12 shows real-world scenarios where Auto Opt Mem can be used.

6.3. Comparative analysis

Table 13 summarizes recent research in intelligent cloud computing that focuses on automation, optimization, and deep learning. This table shows how the proposed Auto Opt Mem framework aligns with these studies and focuses on dynamic memory and performance optimization in serverless environments.

7. Conclusions

Memory configuration in serverless computing can be challenging
Fig. 6. Comparison of reward function and other metrics.

Table 8
Comparison with previous studies.

Approach | Latency Reduction ( %) | Cost Reduction ( %) | QoS Improvement ( %)
Auto Opt Mem vs Sizeless [17] | 25-30 % | 15-18 % | 10-12 %
Auto Opt Mem vs FnCapacitor [18] | 5-7 % | 6-8 % | 2-3 %

due to the ephemeral nature of serverless functions, which are short-lived and stateless. This research examines memory configuration mechanisms and classifies them into three main approaches: machine learning-based, exploration-based, and framework-based. The advantages and disadvantages of each mechanism, as well as the challenges and performance metrics affecting their effectiveness, are discussed. Memory configuration is one of the important challenges in serverless computing; in this paper, we propose an autonomous deep learning-based memory optimization system for serverless computing, referred to as Auto Opt Mem. The results show that Auto Opt Mem optimizes resource utilization, decreases operation costs and latency, and enhances quality of service (QoS), and hence it can be a perfect fit for developers in serverless systems. Auto Opt Mem shows noticeable improvements over previous methods. Compared to Sizeless, it reduces latency by 25-30 %, lowers cost by 15-18 %, and improves QoS by 10-12 %. Against FnCapacitor, it achieves 5-7 % latency reduction, 6-8 % cost reduction, and 2-3 % QoS improvement. Our experiments demonstrate that Auto Opt Mem on average provides 16.8 % lower latency, 11.8 % cost reduction, and 6.8 % QoS improvement across both methods.

In our approach, one of the important hyperparameters in deep reinforcement learning is the learning rate, which controls the extent of updates to the model's weights. A high learning rate typically causes the model to update its weights more aggressively, and may cause inefficient memory usage because of unstable learning, leading to suboptimal allocation. Conversely, a very small learning rate can lead to very slow training and slow convergence, delaying memory optimization. In short, a high learning rate can achieve convergence quickly, but it may pass the
Fig. 7. Comparison with previous studies.

Table 9
Comparison of approaches across 8 experiments.

Approach | Latency (ms) | Cost ($) | QoS ( %)
Sizeless [17] | 362.85, 370.14, 355.92, 380.18, 364.77, 359.60, 372.31, 368.54 | 0.00243, 0.00236, 0.00248, 0.00244, 0.00241, 0.00237, 0.00246, 0.00242 | 80.15, 81.24, 79.87, 80.45, 81.06, 80.83, 79.92, 80.61
FnCapacitor [18] | 278.42, 271.66, 282.10, 276.74, 274.89, 280.13, 273.80, 275.55 | 0.00256, 0.00252, 0.00257, 0.00255, 0.00251, 0.00253, 0.00259, 0.00254 | 89.44, 90.22, 88.97, 89.75, 90.13, 89.61, 89.88, 90.06
Auto Opt Mem | 267.91, 270.34, 265.10, 268.05, 266.83, 269.14, 267.54, 266.41 | 0.00221, 0.00219, 0.00223, 0.00220, 0.00218, 0.00222, 0.00221, 0.00220 | 91.08, 91.46, 90.83, 91.21, 91.37, 90.97, 91.32, 91.15

Table 10
Statistical comparison of Auto Opt Mem with baseline methods (p-values are vs. Auto Opt Mem).

Approach | Metric | Average | Std. Dev. | 95 % CI | p-value
Sizeless [17] | Latency (ms) | 366.29 | 8.18 | (360.43, 372.15) | 0.002
Sizeless [17] | Cost ($) | 0.00243 | 0.00004 | (0.00240, 0.00246) | 0.018
Sizeless [17] | QoS ( %) | 80.52 | 0.43 | (80.22, 80.82) | 0.001
FnCapacitor [18] | Latency (ms) | 276.29 | 3.29 | (273.97, 278.61) | 0.120
FnCapacitor [18] | Cost ($) | 0.00255 | 0.00003 | (0.00253, 0.00257) | 0.035
FnCapacitor [18] | QoS ( %) | 89.88 | 0.41 | (89.59, 90.17) | 0.280
Auto Opt Mem | Latency (ms) | 267.92 | 1.65 | (266.74, 269.10) |
Auto Opt Mem | Cost ($) | 0.00221 | 0.00002 | (0.00220, 0.00222) |
Auto Opt Mem | QoS ( %) | 91.17 | 0.24 | (91.00, 91.34) |

optimal point. As a result, it leads to irregular updates and requires additional iterations to correct errors, which increases memory consumption. A low learning rate, in contrast, yields more stable and gradual convergence, but may require more episodes to converge, which can increase the memory load due to long-term storage of intermediate states. Future research directions include extending Auto Opt Mem to multi-cloud environments and stress-testing Auto Opt Mem's real-time adaptability under more diverse and large-scale workloads. Additionally, the scalability of Auto Opt Mem can be investigated on serverless frameworks other than AWS Lambda.

Funding

No funding was received for this work.

Intellectual property

We confirm that we have given due consideration to the protection of intellectual property associated with this work and that there are no impediments to publication, including the timing of publication, with respect to intellectual property. In so doing we confirm that we have followed the regulations of our institutions concerning intellectual property.

Fig. 8. Compare performance metrics.

Authorship

All listed authors meet the ICMJE criteria. We attest that all authors contributed significantly to the creation of this manuscript, each having fulfilled criteria as established by the ICMJE. We confirm that the manuscript has been read and approved by all named authors, and that the order of authors listed in the manuscript has been approved by all named authors. The Corresponding Author is Mostafa Ghobaei-Arani, who is the sole contact for the editorial process and is responsible for communicating with the other authors about progress, submission of revisions, and final approval of proofs.
Table 11
Differentiation of our approach from previous studies.

Aspect | Sizeless [17] | FnCapacitor [18] | Our Work (Auto Opt Mem)
Methodology | Multi-target regression using monitoring data from a single memory size | Sandboxing, performance tests, and statistical/DNN modeling | Deep reinforcement learning with MAPE control loop
Decision Type | Predicts execution time and cost for other memory sizes | Estimates function capacity (max concurrency under SLO) | Learns and selects optimal memory configuration
Adaptivity | Static once trained, no continuous learning | Requires offline profiling for changes | Dynamic and continuous adaptation at runtime
Focus | Memory-performance trade-offs | Function capacity and concurrency | Memory optimization balancing latency, cost, QoS, and utilization
Limitation | No runtime adaptability | Re-profiling needed for new workloads |
Innovation | Efficient prediction with limited input | Accurate FC estimation for functions | Self-adaptive and autonomous optimization

Table 12
Application scenarios for Auto Opt Mem.

Scenario | Type of Workload | Role of Auto Opt Mem
ML inference | CPU-bound | Optimizes latency and cost by adjusting memory/CPU
Video transcoding | CPU & memory-intensive | Balances higher memory cost with faster execution time
Data preprocessing (ETL) | Mixed (CPU + I/O) | Adapts memory allocation based on input size
API aggregation | I/O-bound | Keeps memory low while ensuring QoS in parallel API calls

Table 13
Comparative analysis with recent studies in the field of intelligent cloud computing.

Ref | Focus Area | Technique | Key Contribution | Relation to Present Work
[59] | Secure data deduplication | Convergent encryption | Reduces redundancy and ensures secure cloud storage | Our DRL approach similarly targets resource efficiency, but in serverless memory optimization
[60] | Cloud key management | Machine learning-based security framework | Intelligent key lifecycle management | Both emphasize intelligent automation for secure and efficient cloud operations
[61] | Deep learning for cloud/edge/fog/IoT | Deep learning models survey | Provides insight into intelligent distributed learning paradigms | Inspires our DL-driven resource optimization framework
[62] | Cloud security and privacy | Deep learning-based attack detection | Enhances privacy and adaptive threat response | Our work extends this intelligence toward performance optimization and QoS improvement in serverless systems

CRediT authorship contribution statement

Zahra Shojaee Rad: Resources, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation, Conceptualization. Mostafa Ghobaei-Arani: Writing - review & editing, Writing - original draft, Visualization, Validation, Supervision, Software, Resources, Project administration. Reza Ahsan: Writing - review & editing, Writing - original draft.

Declaration of competing interest

No conflict of interest exists. The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

References

[1] Ioana Baldini, Paul Castro, Kerry Chang, Perry Cheng, Stephen Fink, Vatche Ishakian, Nick Mitchell, et al., Serverless computing: current trends and open problems, Res. Adv. Cloud Comput. (2017) 1-20.
[2] A. Ebrahimi, M. Ghobaei-Arani, H. Saboohi, Cold start latency mitigation mechanisms in serverless computing: taxonomy, review, and future directions, J. Syst. Archit. 151 (2024) 103115.
[3] AWS, ServerlessVideo: connect with users around the world!, https://serverlessland.com/, 2023.
[4] AWS, Serverless case study - Netflix, https://dashbird.io/blog/serverless-case-study-netflix/, 2020.
[5] CapitalOne, Capital One saves developer time and reduces costs by going serverless on AWS, https://aws.amazon.com/solutions/case-studies/capital-one-lambda-ecs-case-study/, 2023.
[6] E. Johnson, Deploying ML models with serverless templates, https://aws.amazon.com/blogs/compute/deploying-machine-learning-models-with-serverless-templates/, 2021.
[7] A. Sojasingarayar, Build and deploy LLM application in AWS, https://medium.com/@abonia/build-and-deploy-llm-application-in-aws-cca46c662749, 2024.
[8] A. Gholami, M. Ghobaei-Arani, A trust model based on quality of service in cloud computing environment, Int. J. Database Theor. Appl. 8 (5) (2015) 161-170, https://doi.org/10.14257/ijdta.2015.8.5.13.
[9] DataDog, The State of Serverless, 2020, https://www.datadoghq.com/state-of-serverless/.
[10] M. Tari, M. Ghobaei-Arani, J. Pouramini, M. Ghorbian, Auto-scaling mechanisms in serverless computing: a comprehensive review, Comput. Sci. Rev. 53 (2024) 100650, https://doi.org/10.1016/j.cosrev.2024.100650.
[11] M. Ghorbian, M. Ghobaei-Arani, R. Asadolahpour-Karimi, Function placement approaches in serverless computing: a survey, J. Syst. Archit. 157 (2024) 103291.
[12] B. Jacob, R. Lanyon-Hogg, D.K. Nadgir, A.F. Yassin, A Practical Guide to the IBM Autonomic Computing Toolkit, IBM International Technical Support Organization, 2004.
[13] Michael Maurer, Ivan Breskovic, Vincent C. Emeakaroha, Ivona Brandic, Revealing the MAPE loop for the autonomic management of cloud infrastructures, in: 2011 IEEE Symposium on Computers and Communications (ISCC), IEEE, 2011, pp. 147-152.
[14] Stuart J. Russell, Peter Norvig, Artificial Intelligence: A Modern Approach, Pearson, 2016.
[15] Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore, Reinforcement learning: a survey, J. Artif. Intell. Res. 4 (1996) 237-285.
[16] Rajkumar Rajavel, Mala Thangarathanam, Adaptive probabilistic behavioural learning system for the effective behavioural decision in cloud trading negotiation market, Fut. Gener. Comput. Syst. 58 (2016) 2941.
[17] Simon Eismann, Long Bui, Johannes Grohmann, Cristina Abad, Nikolas Herbst, Samuel Kounev, Sizeless: predicting the optimal size of serverless functions, in: Proceedings of the 22nd International Middleware Conference, 2021, pp. 248259.
[18] Anshul Jindal, Mohak Chadha, Shajulin Benedict, Michael Gerndt, Estimating the capacities of function-as-a-service functions, in: Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing Companion, 2021, pp. 18.
[19] Djob Mvondo, Mathieu Bacou, Kevin Nguetchouang, Lucien Ngale, Stéphane Pouget, Josiane Kouam, Renaud Lachaize, et al., OFC: an opportunistic caching system for FaaS platforms, in: Proceedings of the Sixteenth European Conference on Computer Systems, 2021, pp. 228244.
[20] Myung-Hyun Kim, Jaehak Lee, Heonchang Yu, Eunyoung Lee, Improving memory utilization by sharing DNN models for serverless inference, in: 2023 IEEE International Conference on Consumer Electronics (ICCE), IEEE, 2023, pp. 16.
[21] Siddharth Agarwal, Maria A. Rodriguez, Rajkumar Buyya, Input-based ensemble-learning method for dynamic memory configuration of serverless computing functions, arXiv preprint arXiv:2411.07444, 2024.
[22] Gor Safaryan, Anshul Jindal, Mohak Chadha, Michael Gerndt, SLAM: SLO-aware memory optimization for serverless applications, in: 2022 IEEE 15th International Conference on Cloud Computing (CLOUD), IEEE, 2022, pp. 3039.
[23] Robert Cordingly, Sonia Xu, Wes Lloyd, Function memory optimization for heterogeneous serverless platforms with CPU time accounting, in: 2022 IEEE International Conference on Cloud Engineering (IC2E), IEEE, 2022, pp. 104115.
[24] Tetiana Zubko, Anshul Jindal, Mohak Chadha, Michael Gerndt, MAFF: self-adaptive memory optimization for serverless functions, in: European Conference on Service-Oriented and Cloud Computing, Springer International Publishing, Cham, 2022, pp. 137154.
[25] Josef Spillner, Resource management for cloud functions with memory tracing, profiling and autotuning, in: Proceedings of the 2020 Sixth International Workshop on Serverless Computing, 2020, pp. 1318.
[26] Zengpeng Li, Huiqun Yu, Guisheng Fan, Time-cost efficient memory configuration for serverless workflow applications, Concurr. Comput.: Pract. Exp. 34 (27) (2022) e7308.
[27] Andrea Sabbioni, Lorenzo Rosa, Armir Bujari, Luca Foschini, Antonio Corradi, A shared memory approach for function chaining in serverless platforms, in: 2021 IEEE Symposium on Computers and Communications (ISCC), IEEE, 2021, pp. 16.
[28] Aakanksha Saha, Sonika Jindal, EMARS: efficient management and allocation of resources in serverless, in: 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), IEEE, 2018, pp. 827830.
[29] Amit Samanta, Faraz Ahmed, Lianjie Cao, Ryan Stutsman, Puneet Sharma, Persistent memory-aware scheduling for serverless workloads, in: 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), IEEE, 2023, pp. 615621.
[30] Meenakshi Sethunath, Yang Peng, A joint function warm-up and request routing scheme for performing confident serverless computing, High-Confidence Comput. 2 (3) (2022) 100071.
[31] Anisha Kumari, Manoj Kumar Patra, Bibhudatta Sahoo, Ranjan Kumar Behera, Resource optimization in performance modeling for serverless application, Int. J. Inf. Technol. 14 (6) (2022) 28672875.
[32] Anjo Vahldiek-Oberwagner, Mona Vij, Meshwa: the case for a memory-safe software and hardware architecture for serverless computing, arXiv preprint arXiv:2211.08056, 2022.
[33] Divyanshu Saxena, Tao Ji, Arjun Singhvi, Junaid Khalid, Aditya Akella, Memory deduplication for serverless computing with Medes, in: Proceedings of the Seventeenth European Conference on Computer Systems, 2022, pp. 714729.
[34] Jie Li, Laiping Zhao, Yanan Yang, Kunlin Zhan, Keqiu Li, Tetris: memory-efficient serverless inference through tensor sharing, in: 2022 USENIX Annual Technical Conference (USENIX ATC 22), 2022.
[35] Dmitrii Ustiugov, Plamen Petrov, Marios Kogias, Edouard Bugnion, Boris Grot, Benchmarking, analysis, and optimization of serverless function snapshots, in: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021, pp. 559572.
[36] Ao Wang, Jingyuan Zhang, Xiaolong Ma, Ali Anwar, Lukas Rupprecht, Dimitrios Skourtis, Vasily Tarasov, Feng Yan, Yue Cheng, InfiniCache: exploiting ephemeral serverless functions to build a cost-effective memory cache, in: 18th USENIX Conference on File and Storage Technologies (FAST 20), 2020, pp. 267281.
[37] Anurag Khandelwal, Yupeng Tang, Rachit Agarwal, Aditya Akella, Ion Stoica, Jiffy: elastic far-memory for stateful serverless analytics, in: Proceedings of the Seventeenth European Conference on Computer Systems, 2022, pp. 697713.
[38] Orestis Lagkas Nikolos, Chloe Alverti, Stratos Psomadakis, Georgios Goumas, Nectarios Koziris, Fast and efficient memory reclamation for serverless MicroVMs, arXiv preprint arXiv:2411.12893, 2024.
[39] Zahra Shojaee Rad, Mostafa Ghobaei-Arani, Data pipeline approaches in serverless computing: a taxonomy, review, and research trends, J. Big Data 11 (1) (2024) 142.
[40] Zahra Shojaee Rad, Mostafa Ghobaei-Arani, Reza Ahsan, Memory orchestration mechanisms in serverless computing: a taxonomy, review and future directions, Cluster Comput. (2024) 127.
[41] R. Wolski, C. Krintz, F. Bakir, G. George, W.-T. Lin, CSPOT: portable, multi-scale functions-as-a-service for IoT, in: Proceedings of the 4th ACM/IEEE Symposium on Edge Computing (SEC '19), Association for Computing Machinery, New York, 2019, pp. 236249, https://doi.org/10.1145/3318216.3363314.
[42] V. Yussupov, U. Breitenbücher, F. Leymann, M. Wurster, A systematic mapping study on engineering function-as-a-service platforms and tools, in: Proceedings of the 12th IEEE/ACM International Conference on Utility and Cloud Computing (UCC '19), Association for Computing Machinery, New York, 2019, pp. 229240, https://doi.org/10.1145/3344341.3368803.
[43] Zahra Shojaee Rad, Mostafa Ghobaei-Arani, Federated serverless cloud approaches: a comprehensive review, Comput. Electric. Eng. 124 (2025) 110372.
[44] Ioana Baldini, Paul Castro, Kerry Chang, Perry Cheng, Stephen Fink, Vatche Ishakian, Nick Mitchell, et al., Serverless computing: current trends and open problems, Res. Adv. Cloud Comput. (2017) 120.
[45] M. Elsakhawy, M. Bauer, FaaS2F: a framework for defining execution-SLA in serverless computing, in: 2020 IEEE Cloud Summit, 2020, pp. 5865, https://doi.org/10.1109/IEEECloudSummit48914.2020.00015.
[46] A.U. Gias, G. Casale, COCOA: cold start aware capacity planning for function-as-a-service platforms, in: 2020 28th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2020, pp. 18, https://doi.org/10.1109/MASCOTS50786.2020.9285966.
[47] C.K. Dehury, S.N. Srirama, T.R. Chhetri, CCoDaMiC: a framework for coherent coordination of data migration and computation platforms, Futur. Gener. Comput. Syst. 109 (2020) 116, https://doi.org/10.1016/j.future.2020.03.029.
[48] A. Tariq, A. Pahl, S. Nimmagadda, E. Rozner, S. Lanka, Sequoia: enabling quality-of-service in serverless computing, in: Proceedings of the 11th ACM Symposium on Cloud Computing (SoCC '20), Association for Computing Machinery, New York, 2020, pp. 311327, https://doi.org/10.1145/3419111.3421306.
[49] J. Manner, S. Kolb, G. Wirtz, Troubleshooting serverless functions: a combined monitoring and debugging approach, SICS Softw.-Intensiv. Cyber-Phys. Syst. 34 (2) (2019) 99104, https://doi.org/10.1007/s00450-019-00398-6.
[50] J. Nupponen, D. Taibi, Serverless: what it is, what to do and what not to do, in: 2020 IEEE International Conference on Software Architecture Companion (ICSA-C), 2020, pp. 4950, https://doi.org/10.1109/ICSA-C50368.2020.00016.
[51] G. Cordasco, M. D'Auria, A. Negro, V. Scarano, C. Spagnuolo, Fly: a domain-specific language for scientific computing on FaaS, in: U. Schwardmann, C. Boehme, D.B. Heras, V. Cardellini, E. Jeannot, A. Salis, C. Schifanella, R.R. Manumachu, D. Schwamborn, L. Ricci, O. Sangyoon, T. Gruber, L. Antonelli, S.L. Scott (Eds.), Euro-Par 2019: Parallel Processing Workshops, Springer, Cham, 2020, pp. 531544.
[52] B. Jambunathan, K. Yoganathan, Architecture decision on using microservices or serverless functions with containers, in: 2018 International Conference on Current Trends Towards Converging Technologies (ICCTCT), 2018, pp. 17, https://doi.org/10.1109/ICCTCT.2018.8551035.
[53] A. Keshavarzian, S. Sharifian, S. Seyedin, Modified deep residual network architecture deployed on serverless framework of IoT platform based on human activity recognition application, Futur. Gener. Comput. Syst. 101 (2019) 1428, https://doi.org/10.1016/j.future.2019.06.009.
[54] Gerald Tesauro, Temporal difference learning and TD-Gammon, Commun. ACM 38 (3) (1995) 5868.
[55] Eric Rutten, Nicolas Marchand, Daniel Simon, Feedback control as MAPE-K loop in autonomic computing, in: Software Engineering for Self-Adaptive Systems III. Assurances: International Seminar, Dagstuhl Castle, Germany, December 1519, 2013, Revised Selected and Invited Papers, Springer International Publishing, Cham, 2018, pp. 349373.
[56] Evangelina Lara, Leocundo Aguilar, Mauricio A. Sanchez, Jesús A. García, Adaptive security based on MAPE-K: a survey, in: Applied Decision-Making: Applications in Computer Sciences and Engineering, Springer International Publishing, Cham, 2019, pp. 157183.
[57] Jeffrey O. Kephart, David M. Chess, The vision of autonomic computing, Computer (Long Beach, Calif.) 36 (1) (2003) 4150.
[58] Alistair McLean, Roy Sterritt, Autonomic computing in the cloud: an overview of past, present and future trends, in: The 2023 IARIA Annual Congress on Frontiers in Science, Technology, Services, and Applications: Technical Advances and Human Consequences, 2023.
[59] Shahnawaz Ahmad, Mohd Arif, Javed Ahmad, Mohd Nazim, Shabana Mehfuz, Convergent encryption enabled secure data deduplication algorithm for cloud environment, Concurr. Computat.: Pract. Exp. 36 (21) (2024) e8205.
[60] Shahnawaz Ahmad, Shabana Mehfuz, Shabana Urooj, Najah Alsubaie, Machine learning-based intelligent security framework for secure cloud key management, Cluster Comput. 27 (5) (2024) 59535979.
[61] Shahnawaz Ahmad, Iman Shakeel, Shabana Mehfuz, Javed Ahmad, Deep learning models for cloud, edge, fog, and IoT computing paradigms: survey, recent advances, and future directions, Comput. Sci. Rev. 49 (2023) 100568.
[62] Shahnawaz Ahmad, Mohd Arif, Shabana Mehfuz, Javed Ahmad, Mohd Nazim, Deep learning-based cloud security: innovative attack detection and privacy focused key management, IEEE Trans. Comput. (2025).