Computer Standards & Interfaces 97 (2026) 104098

Contents lists available at ScienceDirect
Computer Standards & Interfaces
journal homepage: www.elsevier.com/locate/csi

An autonomous deep reinforcement learning-based approach for memory configuration in serverless computing

Zahra Shojaee Rad, Mostafa Ghobaei-Arani *, Reza Ahsan
Department of Computer Engineering, Qo.C., Islamic Azad University, Qom, Iran

* Corresponding author. E-mail address: mo.ghobaei@iau.ac.ir (M. Ghobaei-Arani).
https://doi.org/10.1016/j.csi.2025.104098
Received 2 May 2025; Received in revised form 16 November 2025; Accepted 17 November 2025; Available online 19 November 2025
0920-5489/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

A R T I C L E  I N F O

Keywords: Serverless computing; Memory configuration; Deep reinforcement learning; Autonomous computing; Function-as-a-service

A B S T R A C T

Serverless computing has become very popular in recent years due to its cost savings and flexibility. It is a cloud computing model that allows developers to create and deploy code without having to manage the infrastructure, and it has been embraced for its scalability, cost savings, and ease of use. However, memory configuration is one of the important challenges in serverless computing due to the transient nature of serverless functions, which are stateless and ephemeral. In this paper, we propose an autonomous approach for memory configuration, called Auto Opt Mem, that uses deep reinforcement learning and a reward mechanism. In the Auto Opt Mem mechanism, the system learns to allocate memory resources to serverless functions in a way that balances overall performance and minimizes wastage of resources. Finally, we validate the effectiveness of our solution: the findings reveal that the Auto Opt Mem mechanism enhances resource utilization, reduces operational cost and latency, and improves quality of service (QoS). Our experiments demonstrate that the Auto Opt Mem mechanism achieves 16.8 % lower latency than static allocation, an 11.8 % cost reduction, and a 6.8 % improvement in QoS, resource utilization, and memory allocation efficiency compared with baseline methods.

1. Introduction

Serverless computing has emerged as an extended cloud computing model that offers many advantages in flexibility, scalability, and cost efficiency [1]. By separating out the management of the underlying infrastructure, developers can focus on writing and deploying code without worrying about server provisioning or maintenance. There has been a lot of progress in various areas related to serverless computing [2]. One aspect is Function as a Service (FaaS), which has increasingly been associated with a variety of applications, including video streaming platforms [3], multimedia processing [4], Continuous Integration/Continuous Deployment (CI/CD) pipelines [5], Artificial Intelligence/Machine Learning (AI/ML) inference tasks [6], and query processing for Large Language Models (LLMs) [7]. FaaS is a serverless cloud computing model that allows developers to run small, manageable services in isolated environments called function instances [8].

Despite these advantages, memory configuration in serverless environments is a complex challenge due to the transient and stateless nature of serverless functions. Choosing the right amount of memory, or resource size, is important and challenging because it determines both execution time and cost. A recent survey found that 47 % of serverless functions in production use the default memory size, indicating that developers often overlook the importance of resource sizing [9]. Traditional memory configuration methods are often manual settings or static allocation, which may lead to inefficiencies such as overprovisioning or underutilization. These inefficiencies can lead to increased costs or decreased performance and affect the effectiveness of the serverless function. By employing deep learning models, memory configuration can be automated, leading to an efficient solution. Deep learning models analyze historical data to dynamically predict optimal memory settings, adapt to varying workloads, and minimize latency and cost. This approach uses the ability of deep learning to identify complex patterns and relationships in data, enabling more accurate and efficient resource management.

Recent research shows the importance of intelligent and autonomous systems in different domains. For instance, Arduino-based IoT automation systems have shown how lightweight and adaptive architectures can improve efficiency and minimize manual intervention in constrained environments [10]. Likewise, autonomous AI frameworks for fraud detection in the Dark Web demonstrate the ability of self-learning mechanisms to adapt to dynamic and unpredictable conditions [11]. These advances further motivate the need for AI-driven autonomic solutions, such as our proposed Auto Opt Mem framework for serverless memory configuration.

1.4. Paper organization

This paper is organized into several sections: Section 2 reviews related memory configuration methods. Section 3 offers background information. Section 4 presents a comprehensive explanation of the proposed solution. Section 5 assesses and discusses the experimental results. Section 6 presents the discussion. Section 7 presents the conclusions with our findings and outlines future research directions.

1.1. Research gap and motivation

Previous approaches are often static or limited to a specific platform. The need for real-time adaptability to changing workloads is not met: they have relied solely on statistical modeling and lacked the ability
to adapt in real time. The direct relationship between cost, latency, and QoS has rarely been considered in a comprehensive framework. Our work addresses these research gaps by introducing Auto Opt Mem, based on the MAPE loop and DRL.

The motivation for this study is that many functions still use default memory values, resulting in wasted resources, increased costs, and reduced quality of service (QoS). For example, 47 % of users rely on default memory settings, which emphasizes the importance of intelligent and adaptive optimization [9]. Addressing this gap is essential to improve performance and cost-effectiveness in serverless environments.

1.2. Our approach

In this paper, we propose an autonomous memory configuration approach with a deep learning model that predicts memory settings, based on a combination of the concepts of autonomic computing and deep reinforcement learning (DRL), with the aim of increasing performance and cost-effectiveness. To realize autonomic computing, IBM has introduced a reference framework for autonomic control loops known as the MAPE (Monitor, Analyze, Plan, Execute) loop [12,13]. This MAPE control loop resembles the general agent model put forth by Russell and Norvig [14], where an intelligent agent observes its surroundings through sensors and utilizes these observations to decide on actions to take within that environment. The proposed approach follows the MAPE control loop, which consists of four phases: monitoring (M), analysis (A), planning (P), and execution (E). First, in the monitoring phase, the system observes and collects the current state of the environment (the memory of serverless functions). In the analysis phase, the agent, which is a deep neural network, analyzes the observed state and updates its policy based on that state and the reward received. In the planning phase, the agent schedules an action (i.e., a memory configuration) based on the learned policy. In the execution phase, the scheduled action is applied to the environment. We utilize Deep Reinforcement Learning (DRL) [15,16] as a decision-making tool that leverages the predicted outcomes from the analysis phase to determine the best memory configuration during the planning phase. Reinforcement Learning (RL) is a self-learning approach that enhances its effectiveness by continuously interacting with the cloud environment.

1.3. Main contributions

The main contributions of this research can be summarized as follows:

• We propose an autonomic method that uses deep reinforcement learning to predict memory configuration; this method operates based on a reward mechanism.
• We designed a multi-objective reward normalization mechanism that simultaneously balances latency, cost, utilization, and QoS.
• We integrated the MAPE-K control loop with deep reinforcement learning (DRL) to enable closed-loop online adaptation.
• Auto Opt Mem supports real-time continuous learning across varying workloads, which clearly differentiates it from static or offline ML-based predictors.
• Experiments validate the effectiveness of the proposed method and demonstrate performance improvements in metrics such as latency and cost.

2. Related works

This section discusses memory configuration approaches in serverless computing. These approaches are categorized into three main groups: machine learning-based approaches, heuristic-based approaches, and framework-based approaches.

2.1. Machine learning-based approaches

Simon Eismann et al. [17] have presented an approach called "Sizeless" for predicting the optimal resource size for serverless functions in cloud computing, based on monitoring data from a single memory size. It highlights the challenges developers face in selecting resource sizes and shows that the method can achieve an average prediction error of 15.3 %, optimizing memory allocation for 79 % of functions and resulting in a 39.7 % speedup and a 2.6 % cost reduction. Anshul Jindal et al. [18] have presented a tool called FnCapacitor for estimating the Function Capacity (FC) of Function-as-a-Service (FaaS) functions in serverless computing environments. It overcomes performance challenges due to system abstractions and dependencies between functions. Through load testing and modeling, FnCapacitor provides accurate FC predictions using statistical methods and deep learning, demonstrating effectiveness on platforms like AWS Lambda and Google Cloud Functions. Djob Mvondo et al. [19] have presented OFC, an in-memory caching system designed for Functions-as-a-Service (FaaS) platforms to improve performance by reducing latency during data access. It leverages machine learning to predict memory requirements for function invocations, utilizing otherwise wasted memory from over-provisioning and idle sandboxes. OFC demonstrates significant execution time improvements for both single-stage and pipelined functions, enhancing efficiency without requiring changes to existing application code. Myung-Hyun Kim et al. [20] have introduced ShmFaas, a serverless platform designed to improve memory utilization for deep neural network (DNN) inference by sharing models in memory across containers. They address data duplication and cold start issues, particularly in resource-constrained edge cloud environments. Experimental results show that ShmFaas reduces memory usage by over 29.4 % compared to common systems, while maintaining negligible latency overhead and enhancing throughput. Siddharth Agarwal et al. [21] have presented MemFigLess, an input-aware memory allocation framework for serverless computing functions, designed to optimize resource usage and reduce costs. Using a multi-output Random Forest Regression model, it correlates the input features of the function with memory requirements, leading to accurate memory configuration. The evaluation shows that MemFigLess can significantly decrease resource allocation and save on runtime costs. Finally, Table 1 shows a comparison of machine learning-based approaches from related studies.

Table 1. Comparison of machine learning-based approaches.
[17] Metrics: execution time; resource consumption. Method: multi-target regression model over monitoring data. Advantages: reduces execution time; decreases cost; requires no performance tests. Disadvantages: limited to specific cloud providers; performance overhead. Tools: AWS Lambda; Node.js.
[18] Metric: Function Capacity (FC). Method: statistical and deep learning approaches. Advantage: accurate FC predictions. Disadvantage: limited to specific FaaS platforms. Tools: Python; FnCapacitor; Google Cloud Functions (GCF); AWS Lambda.
[19] Metric: Function Capacity (FC). Method: OFC tool using machine learning. Advantages: reduced execution time; cost-effective management; utilizes idle memory; transparent. Disadvantage: overhead from cache. Tools: Python; Java; OFC; AWS.
[20] Metric: efficiency of memory usage. Method: ShmFaas shares DNN models in memory. Advantages: reduces memory usage; minimizes cold start delays; minimal code changes. Disadvantage: complexity. Tools: Python; ShmFaas; Kubernetes.
[21] Metric: memory utilization. Method: input-aware Random Forest Regression. Advantages: reduces memory allocation; reduces costs. Disadvantages: limited to specific platforms; overhead from monitoring. Tools: AWS Lambda; Python; AWS CloudWatch.

2.2. Heuristic-based approaches

Goor-Safarian et al. [22] have presented SLAM, a tool for optimizing memory settings for serverless applications consisting of multiple Function-as-a-Service (FaaS) functions. It addresses the issues of balancing cost and performance while meeting Service Level Objectives (SLOs). By utilizing distributed tracing, SLAM estimates execution times under various memory settings and identifies optimal configurations. Robert Cordingly et al. [23] presented a method called CPU Time Accounting Memory Selection (CPU-TAMS) to optimize memory configurations for serverless Function-as-a-Service (FaaS) platforms. CPU-TAMS uses CPU time accounting and regression modeling methods to provide recommendations that reduce execution time and costs. Tetiana Zubko et al. [24] presented MAFF (Memory Allocation Framework for FaaS functions), a framework to optimize memory allocation for serverless functions automatically. MAFF adapts memory settings based on function requirements and employs various algorithms to minimize costs and execution duration. The framework was tested on AWS Lambda, demonstrating improved performance compared to existing memory optimization tools. Josef Spillner [25] discussed resource management for serverless functions, focusing on memory tracking, profiling, and automatic tuning. The author outlines the issues that developers face in determining memory allocation due to coarse-grained configurations from cloud providers, and proposes tools to measure memory consumption and dynamically adjust allocations to reduce waste and costs and improve performance in Function-as-a-Service (FaaS) environments.

Zengpeng Li et al. [26] have presented algorithms for optimizing memory configuration in serverless workflow applications, specifically a heuristic urgency-based algorithm (UWC) and a meta-heuristic hybrid algorithm (BPSO). These algorithms aim to balance execution time and cost for serverless applications, solving the challenges posed by memory allocation and performance modeling. Andrea Sabioni et al. [27] have introduced a shared-memory approach for function chaining on serverless platforms and proposed a container-based architecture that increases the efficiency of function composition on the same host. By using a message-oriented middleware that operates over shared memory, this approach reduces response latency and improves resource utilization. The results show performance improvements in request completion rates and reduced latency during function execution. Aakanksha Saha et al. [28] have presented EMARS, an efficient resource management system designed for serverless cloud computing, focusing on optimizing memory allocation for containers. Built on the OpenLambda platform, EMARS uses predictive models based on application workloads to adjust memory limits dynamically, enhancing resource utilization and reducing latency. Experiments demonstrate that tailored memory settings improve performance in serverless functions. Amit Samanta et al. [29] have discussed the issues and opportunities of integrating persistent memory (PM) into serverless computing. They show how PM's unique characteristics, such as its direct load/store access, can increase performance but also lead to bottlenecks when multiple threads concurrently write to it. They propose a PM-aware scheduling system for serverless workloads that optimizes job completion time by managing concurrent access and improving efficiency while ensuring fairness among applications. Meenakshi Sethunath et al. [30] have proposed a joint function warm-up and request routing scheme for serverless computing that optimally utilizes both edge and cloud resources. It addresses challenges such as high latency and cold-start delays by maximizing the hit ratio of requests, and it reduced latency while considering memory and budget constraints. Anisha Kumari et al. [31] have proposed a performance model for optimizing resource allocation in serverless applications, addressing challenges such as cost estimation and performance evaluation. It introduces a greedy optimization algorithm to improve end-to-end response time while considering budget constraints. They utilize serverless applications on AWS to analyze the trade-offs between performance and cost, demonstrating the model's effectiveness in finding optimal resource configurations. Finally, Table 2 shows a comparison of heuristic-based approaches from related studies.

Table 2. Comparison of heuristic-based approaches.
[22] Metric: memory configuration effectiveness based on Service Level Objectives (SLOs). Method: distributed tracing; max-heap-based optimization algorithm. Advantage: balances cost and performance. Disadvantage: limited to specific platforms. Tools: Python; AWS; SLAM.
[23] Metric: efficiency. Method: CPU time accounting; regression modeling. Advantages: reduces runtime; reduces cost. Disadvantage: limited to specific FaaS platforms. Tools: Python; AWS; GCF.
[24] Metric: effectiveness for FaaS functions. Method: Linear, Binary, and Gradient Descent algorithms for self-adaptive memory optimization. Advantages: lower cost; faster execution. Disadvantage: requires specific function profiling. Tools: Python; AWS.
[25] Metrics: efficiency; cost optimization. Method: memory tracing, profiling, and autotuning tools. Advantages: reduces cost; improves performance; autoscales resources. Disadvantage: requires extensive profiling data. Tools: AWS; Functracer; Autotuner; costcalculator.
[26] Metric: efficiency. Method: heuristic (UWC) and meta-heuristic (BPSO) algorithms. Advantages: time-cost tradeoff; optimal workflow. Disadvantages: complexity; computational overhead. Tools: AWS; Python; UWC; BPSO.
[27] Metrics: response latency; resource usage. Method: shared memory using a message-oriented middleware. Advantages: improves request completion rates; improves response time; optimizes resource usage. Disadvantages: limited to co-located functions; complexity in managing shared memory. Tools: message-oriented middleware.
[28] Metrics: memory allocation efficiency; latency. Method: workload-based and memory-based models. Advantages: optimizes resource usage; reduces latency. Disadvantage: complexity. Tools: OpenLambda; Python.
[29] Metrics: throughput; job completion time (JCT). Method: performance modeling and admission control for concurrent access. Advantages: improves throughput; reduces latency. Disadvantage: complexity. Tools: OpenFaaS; Intel Optane DCPMM.
[30] Metrics: latency reduction; request hit ratio. Method: joint function warm-up; routing for edge-cloud collaboration. Advantages: reduces cold-start latency; optimizes performance. Disadvantages: complexity; dependent on accurate profiling. Tools: AWS; Azure Functions.
[31] Metrics: end-to-end response time; cost. Method: greedy optimization algorithm. Advantages: reduces latency; reduces cost; handles cold-start delay. Disadvantages: complexity; dependent on accurate profiling. Tools: Amazon Web Services (AWS).

2.3. Framework-based approaches

Anjo Vahldiek-Oberwagner et al. [32] have proposed a Memory-Safe Software and Hardware Architecture (MeSHwA) to enhance serverless computing and microservices by leveraging memory-safe languages like Rust and WebAssembly. It aims to reduce infrastructure overheads associated with cloud architectures while improving performance and security through a unified runtime environment that isolates services effectively. Divyanshu Saxena et al. [33] have presented Medes, a serverless computing framework that improves performance and resource efficiency by introducing a deduplicated sandbox state. This state reduces memory usage by removing redundant memory chunks across sandboxes, allowing for faster function startups and improved management of warm and cold states. Experiments show that Medes can reduce end-to-end latency and cold starts under memory pressure. Ji Li et al. [34] designed TETRIS, a memory-efficient serverless platform for deep learning inference. It addresses memory overconsumption in serverless systems by implementing tensor sharing and runtime optimization, reducing memory usage while increasing function density. TETRIS automates memory sharing and instance scheduling, providing efficient resource utilization without compromising performance.

Dmitrii Ustiugov et al. [35] have discussed cold-start latency in serverless computing and introduced vHive, an open-source framework for experimentation. It shows the inefficiencies of snapshot-based function invocations, which can result in high execution times due to frequent page faults. They propose REAP, which prefetches memory pages to reduce cold-start delays by 3.7 times, improving the performance of serverless functions. Ao Wang et al. [36] have presented INFINICACHE, an innovative in-memory object caching system leveraging ephemeral serverless functions to provide a cost-effective solution for large object caching in web applications. It shows the system's ability to achieve cost savings while maintaining high data availability and performance through techniques like erasure coding and intelligent resource management. Anurag Khandelwal et al. [37] have presented Jiffy, an elastic far-memory system for serverless analytics. It overcomes the limitations of existing memory allocation systems by enabling fine-grained, block-level resource allocation, allowing jobs to meet their real-time memory needs. Jiffy minimizes performance degradation and resource underutilization by dynamically managing memory for individual tasks. Orestis Lagkas Nikolos et al. [38] have introduced HotMem, a mechanism designed to enhance memory reclamation in serverless computing environments for Function-as-a-Service (FaaS) models using microVMs. It addresses the issue of memory elasticity during scale-down by segregating memory allocations for individual function instances, thereby enabling rapid and efficient reclamation without the overhead of page migrations. HotMem improves memory management performance while maintaining low latency for function execution. Finally, Table 3 shows a comparison of framework-based approaches from related studies.

Table 3. Comparison of framework-based approaches.
[32] Metric: performance and resource efficiency. Method: MeSHwA, a memory-safe software and hardware architecture. Advantages: increases security and performance; resource sharing. Disadvantage: complexity. Tools: Python; Rust; Wasm.
[33] Metrics: end-to-end latency; memory usage. Method: Medes, a framework utilizing memory deduplication to create a new deduplicated sandbox. Advantages: reduces cold starts; increases flexibility; optimal memory. Disadvantages: complexity; overhead from deduplication. Tools: AWS Lambda; Python; OpenWhisk; OpenFaaS; FunctionBench; CRIU.
[34] Metric: function startup latency. Method: runtime sharing to reduce memory consumption. Advantages: memory savings; reduces cold starts. Disadvantages: complexity; tensor management overhead. Tools: OpenFaaS; TensorFlow.
[35] Metrics: cold-start latency; memory efficiency. Method: REAP, a mechanism that records and prefetches a function's working set. Advantage: reduces cold-start latency. Disadvantage: overhead on the first invocation. Tools: Python; Kubernetes.
[36] Metrics: latency of function invocation; data availability. Method: INFINICACHE, in-memory caching through erasure coding. Advantages: cost saving; data availability. Disadvantages: limited to large object caching; relies on the limitations of serverless architecture. Tools: AWS Lambda.
[37] Metrics: execution time; memory utilization. Method: Jiffy, an elastic far-memory system. Advantages: improves execution time; increases resource utilization. Disadvantages: complexity; relies on a specific serverless architecture. Tools: Amazon EC2; Python; C++; Java.
[38] Metrics: memory reclamation speed; tail latency. Method: HotMem, a memory management framework with rapid reclamation of hotplugged memory. Advantage: faster memory reclamation. Disadvantage: challenges in memory management. Tools: OpenWhisk; Azure.

3. Background

In this section, we explain the concepts of serverless computing and then provide explanations about memory configuration in serverless computing.

3.1. Serverless computing

Serverless computing enables developers to concentrate on coding without the necessity of managing or provisioning servers, which is why it is termed "serverless." Serverless computing provides an efficient and scalable solution for running programs. Its ease of management and lightweight features have made it popular as an implementation model in cloud computing [39]. Serverless computing is an abstraction of cloud computing infrastructure: a cloud computing model in which a cloud provider or a third-party vendor manages the company's servers. The company does not need to purchase, install, host, and manage the servers; instead, the cloud provider delivers all of these services.

Serverless computing, also known as Function-as-a-Service (FaaS), ensures that the code exposed to the developer consists of simple, event-driven functions. As a result, developers can focus more on writing code and delivering innovative solutions, without the hassle of creating test environments and provisioning and managing servers for web-based applications [40]. FaaS and the term "serverless" can be used interchangeably, with serverless computing being FaaS. This is because the FaaS platform automatically configures and maintains the context of the functions and connects them to cloud services without the need for developers to provide a server [41,42].

Features of serverless computing:

• Pay-per-use: In serverless computing, users pay only for the time their code uses dedicated CPU and storage. Pay-per-use pricing models reduce costs. In contrast, in traditional cloud services, users pay for over-provisioned resources like storage and CPU time, which often sit idle [39,43].
• Speed: Teams can move from idea to market more quickly because they can focus entirely on coding, testing, and iterating without the operational costs of server management. There is no need to update fundamental infrastructure like operating systems or other software patches, allowing teams to concentrate on building the best possible features without worrying about the underlying infrastructure and resources.
• Scalability and elasticity: The cloud vendor is responsible for automatically scaling up capabilities and technologies to meet customer demands. Serverless functions should automatically scale down when there are fewer concurrent users. This elasticity transforms serverless computing models into a pay-as-value billing model [39].
• Efficiency and performance: Developers do not need to perform complex tasks like multi-threading, HTTP request handling, etc. FaaS lets developers focus on building the application rather than configuring it.
• Programming languages: Serverless computing supports many programming languages, including JavaScript, Java, Python, Go, C#, and Swift [44]. This versatility allows developers to choose the language that best suits their project needs and expertise, increasing productivity and enabling rapid application development.

Challenges of serverless computing:

• Vendor lock-in: Vendors typically use proprietary technologies to enable their serverless services. This can create problems for users who want to migrate their workloads to another platform. When migrating to another provider, changes to the code and application architecture are inevitable [45].
• Cold start latency: Serverless services can experience a latency known as a "cold start." When the service is first started, it takes some time for the service to respond. The reason is the initial configuration performed by the cloud service provider and the allocation and initialization of infrastructure resources. This initial delay can be a concern in systems that respond to many requests per second, so methods and techniques for mitigating the cold start problem are important [46–48].
• Debugging complexity: Debugging serverless functions is difficult due to their transient nature, since serverless functions typically do not maintain the state of previous calls. This stateless-by-design behavior can also complicate application state management. In addition, reports on serverless function calls should be sent to the developer, and these reports should include detailed stack traces so that developers can identify the cause of an error. Stack tracing is currently not available for serverless computing, meaning developers cannot easily identify the cause of an error [49–51].
• Architectural complexity: A serverless application may consist of multiple functions. The more functions there are, the more complex the architecture becomes, because each function must be properly linked to other functions and services. Managing this large number of functions and services can be difficult [52].
• Long-running tasks: Serverless platforms execute functions within a limited, short execution time, while some tasks may require long execution. Serverless functions do not support long-running execution because they are stateless: if a function is stopped, it cannot be resumed [53].

3.2. Memory configuration

Memory configuration in serverless computing is an important aspect that influences application performance, resource efficiency, and cost management. Understanding how to optimally allocate memory for serverless functions is important for developers who want to maximize the benefits of serverless computing [40]. In serverless environments, developers deploy small units of code, known as functions, which are executed in response to specific events. Each function operates in a stateless manner and is allocated a certain amount of memory at runtime. The memory configuration affects several factors:

• Performance: The memory allocated to a function determines its execution time. More memory generally means faster execution, as it allows the CPU to perform better and reduces cold start latency; too little memory can make execution very slow and cause downtime.
• Cost efficiency: Serverless pricing is pay-per-use, so costs are determined by the amount of resources utilized during execution. Careful memory configuration can prevent over-allocation of memory, which leads to high costs. However, too little memory leads to performance issues, so the two must be balanced [40].
• Scalability: Memory configuration should be performed in such a way that serverless applications can scale seamlessly under variable workloads. Since demand fluctuates, dynamically allocating memory helps achieve good performance without incurring excessive costs.

Memory configuration optimization and automated, data-driven approaches to memory management in serverless computing increase performance and scalability and help save costs. These methods allow developers to minimize the challenges of manual memory configuration.

3.3. Deep reinforcement learning

Neural networks were combined with reinforcement learning for the first time in 1991, when Gerald Tesauro used reinforcement learning to train a neural network to play backgammon at the master level [54]. Deep reinforcement learning (DRL) is a combination of reinforcement learning and deep neural networks. In reinforcement learning, an agent, while interacting with its environment, incrementally learns a policy that enables it to maximize long-term rewards. Combined with deep neural networks, this approach enables the learning of much more complicated policies suitable for high-dimensional problems. Fig. 1 shows the elements in a reinforcement learning model.

Fig. 1. Elements in a reinforcement learning model.

Deep reinforcement learning is used in various fields such as computer games, robotics, resource management in distributed systems, and performance optimization of cloud computing systems. Deep learning techniques can be used for memory configuration in serverless computing to predict the amount of memory required to run a serverless application.

Serverless computing is a computing model in which the cloud service provider manages the infrastructure and server resources; a developer just needs to write code and upload it onto the serverless platform. The advantages of serverless computing include automatic scalability, per-resource cost, and ease of management. Deep learning can be used to improve memory configuration in serverless computing: it can automatically identify memory usage patterns in applications and allocate the required memory based on them. The benefits of using deep learning for automatic memory configuration in serverless computing applications include:

• Automation: Deep learning can identify patterns concerning memory usage in an application and automate the memory allocation needed, reducing the time spent on memory configuration.
• Optimization: Deep learning can learn how to optimize memory usage by managing how memory is allocated, helping to decrease computing costs and enhance the performance of applications.
• Flexibility: Deep learning can learn from changes in memory usage patterns, helping an application improve over time.

Using deep learning to automate memory configuration in serverless computing also faces some challenges. First, there is a need for training data: deep learning requires enough training data to identify patterns in memory usage. Another challenge is the complexity of deep learning, which also requires considerable time to learn. Nevertheless, using deep learning to automate memory configuration in serverless computing offers numerous benefits: it improves memory configuration techniques and reduces computational costs.

3.4. Autonomic computing

The MAPE-K model provides a framework for the management of autonomous and self-adaptive systems [55–57]. The model comprises five major components: Monitoring, Analysis, Planning, Execution, and Knowledge management, illustrated in Fig. 2. Monitoring is where data is gathered on an ongoing basis to measure the system's current state and functionality relative to predetermined goals. The subsequent Analysis stage examines this data to bring to light

utilization of these functions. Auto Opt Mem utilizes Deep Reinforcement Learning (DRL), a branch of artificial intelligence, to learn an optimal memory configuration policy. DRL enables the system to learn from experience and make decisions based on a reward mechanism [15,16]. In this regard, Auto Opt Mem learns the memory resource allocation to SFs for maximum performance while minimizing waste of resources. By employing DRL, Auto Opt Mem takes into account resource constraints in the serverless environment, resource requirements of functions, energy consumption, latency, and deployment costs.
After problem detection, the Planning stage creates schemes to amend these disparities, specifying how the system needs to modify its behavior. The Execution stage then enacts these strategies, modifying the system's behavior to attain the desired results. Knowledge Management serves as a repository for the important information shared among the various stages. Overall, the MAPE-K model offers a feedback loop that allows autonomous systems to adapt to new situations and ensure optimum performance by constantly monitoring and adjusting; this model is required to build resilient and efficient autonomous systems. In addition, autonomous control systems greatly improve the efficiency and reliability of industrial processes by minimizing human error and ensuring consistent performance. Autonomous control systems are closed-loop controls that are preprogrammed to operate autonomously without the intervention of an operator, and they ensure predictable results in complex industrial settings. Therefore, the integration of automated control systems and sophisticated monitoring techniques makes operations simpler and more reliable, and such systems have become a necessity in contemporary manufacturing plants [58].

Fig. 2. Autonomic computing (MAPE-K loop) [57].

4. Proposed approach

In this section, we explain our proposed approach in more detail. First, we introduce the framework that utilizes machine learning for memory configuration in serverless computing, followed by the presentation of a formulation and an algorithm. Finally, we describe the autonomous memory configuration that uses deep learning to predict memory settings for serverless computing.

4.1. Proposed framework

In the proposed solution, Auto Opt Mem introduces an autonomous learning-based memory configuration approach for serverless computing. The goal of Auto Opt Mem is to address the challenge of efficiently distributing serverless functions (SFs) in a serverless environment while considering a number of real-world parameters. One of the key parameters within Auto Opt Mem is the memory configuration, that is, how much of the available memory resources are allocated to a serverless function. Memory configuration has a direct impact on the performance and resource utilization of these functions.

The system learns the optimal memory configuration policy through a training process that involves interacting with the environment and receiving feedback in the form of rewards. The automatic memory configuration learning approach in Auto Opt Mem enables the system to dynamically adapt to changing conditions and optimize memory resource allocation based on the specific needs of each SF. This approach contributes to efficient resource utilization and improved performance in serverless computing environments. This work proposes a framework called Auto Opt Mem for automatic optimal memory configuration that uses a deep reinforcement learning agent to automatically optimize the memory allocated to each serverless function with respect to computational resources. The primary goal of Auto Opt Mem, utilizing Deep Reinforcement Learning (DRL), is to optimize memory allocation for serverless functions. In other words, the main goal is to dynamically adjust the memory configurations so that functions become highly performant, cost-effective, and efficient regarding resource utilization. Indeed, it seeks to optimize memory for serverless functions to minimize costs and reduce latency while maintaining or improving Quality of Service (QoS).

4.1.1. Key components

This section describes the elements of the Auto Opt Mem framework that are essential to automating the memory configuration process.

Environment: The environment is the serverless platform, such as AWS Lambda or Google Cloud Functions, within which a set of functions is deployed and executed. It exposes resource usage information, performance metrics, and all other relevant status information.

Agent: The agent makes the decisions about the memory size to be assigned to each function. It contains a DRL model that learns through interactions with the environment.

State: The state St at time t includes:
• the current memory allocation for each function,
• performance metrics (e.g., execution time, latency),
• the number of requests received,
• cost metrics, and
• specific performance requirements.

Action: The action At at time t is the choice of a memory size from a predefined set of configurations for a particular function.

Reward: The reward Rt at time t is the signal that provides feedback and therefore drives learning. It is defined with the aim of:
• encouraging efficient memory usage,
• improving performance metrics,
• maintaining or increasing Quality of Service (QoS), and
• minimizing operational costs.

Policy: The policy is the strategy by which the agent selects actions according to the current state. The DRL model implements the policy.

Fig. 3 shows the iterative loop diagram in deep reinforcement learning algorithms.

Fig. 3. Iterative loop diagram in deep reinforcement learning algorithms.

4.2. Problem formulation

This section presents a mathematical model for the memory configuration problem in serverless computing. In the following, we describe the symbols used in more detail.

4.2.1. Notation

• F: set of serverless functions
• Mi: memory allocated to function fi ∈ F
• Li(Mi): latency of function fi with memory Mi
• Ci(Mi): cost of executing function fi with memory Mi
• Ui(Mi): memory utilization of function fi with memory Mi
• Qi(Mi): Quality of Service metric of function fi with memory Mi
• Rt: reward at time t

The list of mathematical symbols is summarized in Table 4.

Table 4. List of mathematical symbols.
F: a set of serverless functions (fi ∈ F)
Mi: memory allocated to function fi
Li(Mi): latency (delay) of function fi with memory Mi
Ci(Mi): cost of executing function fi with memory Mi
Ui(Mi): memory utilization of function fi with memory Mi
Qi(Mi): quality of service criterion of function fi with memory Mi
Rt: reward at time t
S: state space
St: state at time t
A: action space
At: action at time t
Mi,t: current memory allocation of function fi at time t
Li,t: latency at time t
Ci,t: cost at time t
Qi,t: quality of service at time t
α: weighting factor for latency (delay)
β: weighting factor for cost
γ: weighting factor for usage (utilization)
δ: weighting factor for quality of service
Mi,min: minimum memory size that can be allocated to a function
Mi,max: maximum memory size that can be allocated to a function
Qi^min: minimum acceptable QoS for function fi
E: expected value
π: policy
Vπ: value function
θ: policy network parameters
ϕ: value network parameters
Qπ(s, a): action-value function

The framework and automated approach for learning-based memory configuration in serverless computing is as follows:

State space (S). The state St at time t includes:
• the current memory allocation Mi,t for each function fi,
• performance metrics such as latency Li,t, cost Ci,t, and Quality of Service Qi,t, and
• the number of requests received.

Action space (A). The action At at time t involves selecting the memory size Mi,t+1 for each function fi.

Reward function (R). The reward function is an essential component of reinforcement learning, guiding the agent toward desired behaviors by associating rewards or penalties with the outcomes of actions. The reward function Rt is designed to:
• reduce latency and cost, and
• encourage efficient memory utilization and increase QoS.
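The reward design above can be sketched per function. The block below is an illustrative stand-in (the weight values and function names are ours, not the paper's), combining min-max normalized components so that latency and cost are penalized while utilization and QoS are rewarded:

```python
def minmax(x, x_min, x_max):
    """Min-max normalization into [0, 1]."""
    return (x - x_min) / (x_max - x_min)

def reward(latency, cost, utilization, qos,
           alpha=0.4, beta=0.3, gamma=0.2, delta=0.1):
    """Per-function reward over components already normalized into [0, 1].

    Latency and cost are penalized; utilization and QoS are rewarded.
    The total reward sums this term over all functions.
    Weights alpha, beta, gamma, delta are illustrative values only.
    """
    return -(alpha * latency + beta * cost) + gamma * utilization + delta * qos
```

For example, `reward(minmax(120, 0, 200), minmax(0.8, 0, 2), 0.9, 0.95)` scores one function's current allocation; increasing latency or cost lowers the score, while increasing utilization or QoS raises it.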
This is expressed by Eq. (1):

Rt = − Σ_{i∈F} [ α·Li(Mi,t) + β·Ci(Mi,t) − γ·Ui(Mi,t) − δ·Qi(Mi,t) ]   (1)

where α, β, γ, and δ are the weighting factors for latency, cost, utilization, and QoS, respectively. These weights adjust the relative importance of each component in the reward function; for example, if reducing latency is more important than cost, α can be increased relative to β. Lower latency is desirable; therefore, −αLi(Mi,t) penalizes higher latencies. Lower costs are also preferable; thus, −βCi(Mi,t) penalizes higher expenses. Memory utilization should be efficient: over-allocation of memory leads to resource waste, while under-allocation can degrade performance, so +γUi(Mi,t) rewards the agent for optimal memory usage. Higher QoS is preferred; therefore, +δQi(Mi,t) rewards better service quality.

To normalize the reward function for optimizing memory configuration in serverless computing, all components (cost, latency, utilization, and QoS) must be placed on a common scale, because they have different units (for example, cost in dollars and latency in milliseconds). We use min-max normalization for each component; the min-max normalization formula is defined by Eq. (2):

x′ = (x − xmin) / (xmax − xmin)   (2)

where x is the original value, xmin is the minimum value of that metric, and xmax is the maximum value of that metric. After normalization, the metrics can be combined into the reward function without unit conflict, and the final reward formula, expressed by Eq. (3), becomes:

Rt = − Σ_{i∈F} [ α·(Li(Mi,t) − Lmin)/(Lmax − Lmin) + β·(Ci(Mi,t) − Cmin)/(Cmax − Cmin) − γ·(Ui(Mi,t) − Umin)/(Umax − Umin) − δ·(Qi(Mi,t) − Qmin)/(Qmax − Qmin) ]   (3)

Normalization, in effect, maps each component into the range [0, 1], making the components comparable even though they come from different units, so they can be combined meaningfully during the optimization process. This method increases the robustness of the model and enhances convergence in reinforcement learning algorithms.

4.2.2. Optimization problem

First, it is necessary to mathematically formulate the problem in terms of optimization and reinforcement learning for the Auto Opt Mem algorithm and its solution. The optimization problem is expressed by Eq. (4):

max_M Σ_{t=0}^{T} Rt   (4)

subject to the memory constraint expressed by Eq. (5):

Mi,min ≤ Mi,t ≤ Mi,max for all i ∈ F   (5)

and the Quality of Service constraint expressed by Eq. (6):

Qi(Mi,t) ≥ Qi^min for all i ∈ F   (6)

Mi,min and Mi,max are the minimum and maximum memory sizes that can be allocated to function fi. Qi^min is the minimum acceptable QoS for function fi; this constraint ensures that the QoS does not fall below a certain threshold.

4.2.3. Reinforcement learning formulas

This section explains the mathematical formulas used in the reinforcement learning component of the Auto Opt Mem framework.

Policy (π). The policy π(a|s) is the probability distribution over actions given the current state. It is the strategy that the decision-maker follows for its next action in response to the current state.

Value function (Vπ). The value function refers to the expected long-term discounted reward, which contrasts with the short-term reward R as it focuses on the long term. It represents the expected return in the long term resulting from the current state s under policy π. The value function is an important concept in reinforcement learning, representing the expected return (cumulative reward) starting from state s and following policy π, and is defined by Eq. (7):

Vπ(s) = E[ Σ_{t=0}^{∞} γ^t Rt | S0 = s, π ]   (7)

where γ is the discount factor.

• Expected value (E): Due to the random nature of the environment, this is the average, or expected, return over all possible future states.
• Sum of rewards (Σ_{t=0}^{∞} γ^t Rt): This represents the sum of rewards over time. Rewards are discounted by a factor of γ^t to prioritize immediate rewards over distant future rewards.
• Discount factor (γ): The discount factor γ (where 0 < γ < 1) determines the present value of future rewards. A higher γ makes future rewards more significant, while a lower γ emphasizes immediate rewards.
• Policy (π): The policy π is the decision rule that gives the probability of executing action a from state s.

The value function is important because it measures how good a given state is under a given policy. It is also a basis for comparing policies and for determining optimal actions that maximize the long-term reward.

Bellman equation. The Bellman equation provides a recursive decomposition of the value function, making it a powerful tool for solving reinforcement learning problems. The Bellman equation for the value function is defined by Eq. (8):

Vπ(s) = E_{a∼π}[ Rt + γ·Vπ(St+1) | St = s, At = a ]   (8)

Reasons for using the Bellman equation:

• Recursive nature: The Bellman equation breaks down the value of a state into the immediate reward Rt plus the discounted value of the next state γVπ(St+1). The recursion makes computation efficient and forms the foundation for dynamic programming techniques.
• Policy evaluation: By repeatedly updating the value function using the recursive relationship, it aids in evaluating the expected return of a policy π.
• Optimality principle: For finding the optimal policy, the Bellman optimality equation expresses the relationship between the value of a state and the values of subsequent states. This is used in algorithms such as value iteration and Q-learning for finding the optimal value function.
• Simplification of complexity: The use of a recursive approach allows the Bellman equation to simplify the calculation of the value function for all states, which would otherwise be computationally infeasible in large state spaces.

Policy gradient. The policy gradient method is used to optimize the policy π and is defined by Eq. (9):

∇θ J(πθ) = E_{s∼dπ, a∼πθ}[ ∇θ log πθ(a|s) Qπ(s, a) ]   (9)

where θ are the parameters of the policy and Qπ(s, a) is the action-value function. The action-value function represents the expected return of taking action a in state s and then following policy π, providing a basis for policy improvement. The Bellman equation and the value function are the basis for studying and quantifying the long-term impact of decisions, and they guide the policy toward optimal behavior.

4.3. Proposed algorithm

The proposed Auto Opt Mem algorithm uses deep reinforcement learning to learn how to assign serverless functions to compute resources efficiently. Incorporating the MAPE loop, the Auto Opt Mem algorithm continuously monitors, analyzes, plans, and executes actions to optimize memory allocation in an autonomous and adaptive manner. The process of the algorithm is as follows.

4.3.1. Initialization

This is the first step in the deep reinforcement learning (DRL) process for memory configuration. First, two main networks are prepared. The policy network is responsible for choosing the action (e.g., the amount of memory to allocate) in each state; it initially starts with random parameters, because it has no knowledge at the beginning, and gradually learns to make optimal decisions. The value network estimates the long-term value of a state based on the sum of future rewards; this network also starts with random parameters. Then, the initial state of the environment (S0) is defined, which includes the memory configuration of each function, performance indicators (latency, cost, QoS, and resource utilization), and the rate of incoming requests. This step is the basis of training and provides the foundation for the agent's interaction with the environment. This initialization is very important, as it sets the starting point for training the networks to learn optimal memory allocation for serverless functions, with the aim of minimizing cost and latency while maintaining high QoS. This is shown in Algorithm 1.

4.3.2. MAPE loop

In this section, we describe the autonomous memory configuration process, which includes four phases: monitoring, analysis, planning, and execution.

4.3.2.1. Monitor. In the monitoring phase, the system continuously monitors the current state of the environment, so that it is always aware of the state of the environment. This includes monitoring the memory allocated to each serverless function; obtaining performance data such as latency, cost, utilization, and quality of service (QoS); and monitoring the number of incoming requests for each function. The monitoring phase is important because it records real-time information that indicates system performance and resource utilization. In short:

• Continuously observe the current state St, including memory allocation, performance metrics (latency Li,t, cost Ci,t, utilization Ui,t, QoS Qi,t), and the incoming request rate.
• Collect data from the serverless functions and the environment.

4.3.2.2. Analyze. In the analysis phase, the system uses the performance metrics collected during the monitoring phase. It examines the current performance of the serverless functions based on the collected data and calculates a reward value that controls the learning. The reward function is based on latency, cost, utilization, and QoS, thus exposing any inefficiencies or issues that need to be addressed in subsequent phases. In short:

• Evaluate the performance metrics Li(Mi), Ci(Mi), Ui(Mi), and Qi(Mi).
• Calculate the reward Rt based on the current state, which is expressed by Eq. (10):

Rt = − Σ_{i∈F} [ α·(Li(Mi,t) − Lmin)/(Lmax − Lmin) + β·(Ci(Mi,t) − Cmin)/(Cmax − Cmin) − γ·(Ui(Mi,t) − Umin)/(Umax − Umin) − δ·(Qi(Mi,t) − Qmin)/(Qmax − Qmin) ]   (10)

4.3.2.3. Plan. In the planning phase, the system decides on the next actions based on the observations from the analysis. In this stage, the best memory allocation decisions are selected according to the policies learned by the deep reinforcement learning model, and the policy network parameters are updated to enhance future decision-making. This planning helps the system adjust its memory allocation policies effectively based on the current situation and the performance analysis. In short:

• Use the policy network πθ to determine the optimal action At (the memory allocation for the next time step).
• Update the policy and value networks using reinforcement learning techniques. This typically involves:
  ○ calculating the policy gradient using the policy gradient method,
  ○ updating the policy network parameters θ, and
  ○ using the Bellman equation to update the value network parameters ϕ.

4.3.2.4. Execution. The execution phase is responsible for carrying out the planned actions. It allocates memory to every function according to the decisions of the planning phase and transitions to the next state, ready to trigger the MAPE cycle once more. Its role is to realize the changes from planning and analysis, optimize resource utilization, and improve the system as a whole. In summary:

• Apply the selected action to adjust the memory allocation for each function.
• Go to the next state St+1 and repeat the loop.
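The four MAPE phases can be sketched as plain functions. The block below is an illustrative skeleton only: the environment dynamics are stubbed, the weights are ours, and a trivial greedy policy stands in for the DRL policy network.

```python
MEMORY_SIZES = [128, 256, 512, 1024, 2048]  # illustrative discrete action set (MB)

def monitor(env):
    """Monitor: observe the current state of the environment."""
    return {"memory": env["memory"], "latency": env["latency"],
            "cost": env["cost"], "utilization": env["utilization"],
            "qos": env["qos"]}

def analyze(state, w=(0.4, 0.3, 0.2, 0.1)):
    """Analyze: compute the scalar reward from (already normalized) metrics.

    w = (alpha, beta, gamma, delta) are illustrative weight values.
    """
    alpha, beta, gamma, delta = w
    return -(alpha * state["latency"] + beta * state["cost"]
             - gamma * state["utilization"] - delta * state["qos"])

def plan(state, policy):
    """Plan: pick the next memory size that the policy scores highest."""
    return max(MEMORY_SIZES, key=lambda m: policy(state, m))

def execute(env, action):
    """Execute: apply the memory allocation and produce the next state.

    The dynamics below are a stub: more memory lowers latency and raises cost.
    """
    env["memory"] = action
    env["latency"] = 128 / action
    env["cost"] = action / 2048
    env["utilization"] = min(1.0, 256 / action)
    env["qos"] = 1.0 - env["latency"] / 2
    return env
```

One MAPE iteration is then `execute(env, plan(monitor(env), policy))`, with the reward from `analyze` fed back to whatever learner updates the policy.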
The MAPE loop enables a dynamic and automated way of managing memory in serverless computing systems, where the system can learn and adapt to new situations on a continuous basis, eventually resulting in enhanced performance and resource efficiency, as shown in Algorithm 2.

Algorithm 1. Pseudocode for the initialization phase.
1: Input: Set of serverless functions F
2: Output: Initialized policy network πθ, value network Vϕ, and initial state S0
3: Initialize policy network πθ with random parameters θ
4: Initialize value network Vϕ with random parameters ϕ
5: Define initial system state S0 including:
6:   - Memory allocation for each fi ∈ F
7:   - Performance metrics: Latency Li(Mi), Cost Ci(Mi), Utilization Ui(Mi), QoS Qi(Mi)
8:   - Incoming request rate
9: Return (πθ, Vϕ, S0)

4.3.3. Training loop

The training loop is an important part of the Auto Opt Mem algorithm, which uses deep reinforcement learning to improve how memory is allocated for serverless functions. The process starts by resetting the environment to its initial state, S0. For each time step t, the algorithm goes through the MAPE loop. It begins by checking the current state St, gathering data about memory usage, performance metrics, and other relevant information. After collecting this data, it analyzes it to see how well the system is performing and calculates a reward that helps guide future learning. Next, the algorithm plans the next action by choosing the best way to allocate memory using its policy network; this decision is based on the insights gained from the analysis. Once it decides on an action, the system carries it out, adjusting the memory for each function as needed. Finally, the algorithm moves to the next state St+1 and prepares to start the loop again. This ongoing process allows the system to learn and adapt continuously, refining its memory allocation strategies based on real-time feedback. Each cycle helps improve the overall performance and efficiency of memory management in the serverless environment. In summary:

1. Environment reset: Reset the environment to its initial state S0.
2. MAPE execution: For each time step t:
• Monitor: Observe the current state St.
• Analyze: Evaluate performance and calculate the reward.
• Plan: Select an action using the policy network.
• Execute: Execute the action and move to the next state St+1.

Algorithm 3 shows the execution phase of the MAPE loop during the training process.

5. Performance evaluation

In this section, we present the performance evaluation of the novel automatic deep learning-based approach (Auto Opt Mem) for memory configuration in serverless computing. We describe the experimental setup, the performance metrics, and the results of the experiments.

5.1. Experimental setup

We carried out the experimental analysis in this study on a 64-bit Windows 11 computer with an Intel Core i7 processor. The evaluation uses a serverless simulation environment in which memory sizes between 128 MB and 2048 MB are modeled to study their impact on latency, cost, and quality of service (QoS). A virtual CPU (vCPU) model with burst behavior and dynamic workload scaling is incorporated into the simulation. Python was used to implement the deep reinforcement learning algorithms. The Proximal Policy Optimization (PPO) algorithm was used to train the deep reinforcement learning agent, as it is known to be stable and efficient in policy optimization. PPO was chosen because it is widely suitable for continuous control and policy optimization problems in environments with large, dynamic state spaces, such as serverless environments, and it offers higher training stability than algorithms such as REINFORCE or Vanilla Policy Gradient. It can control the trade-off between exploration and exploitation in dynamic environments where the workload changes randomly, and it is suitable and scalable in environments with high-dimensional state spaces where multiple parameters, such as latency, cost, utilization, and quality of service (QoS), must be optimized simultaneously. The workloads used in our evaluation consisted of four modeled serverless scenarios: ML inference, API aggregation, data preprocessing (ETL), and video processing. These workloads are designed to emulate the performance behavior of serverless applications while being executed in a fully simulated environment. Furthermore, all experiments are implemented in Python, and the source code of the simulation can be downloaded from the GitHub repository.1 Although AWS Lambda and CloudWatch were used conceptually to structure the resource and metric model, all experiments were executed in a simulated environment, as shown in Table 5.

The experiments involved tuning a set of hyperparameters:
• Learning rate: Different learning rates, such as 0.01, 0.001, and 0.0001, were tried during training.
• Discount factor (γ): A discount factor of 0.99 was chosen to give higher importance to long-term rewards in the reinforcement learning.
• Batch size: A batch size of 32 was used in training the deep learning models, a trade-off between training time and model accuracy.
• Number of episodes: Training was carried out over 100 episodes, during which the agent learns through experience interacting with the environment and tunes its memory allocation policies.

Both the policy and value networks are implemented as multilayer perceptrons (MLPs). The input layer receives the state vector, which includes memory allocation, latency, cost, utilization, QoS, and request rate. Each network has two fully connected hidden layers with 128 and 64 neurons, respectively, and ReLU activation functions. The policy network ends with a softmax output layer producing a probability distribution over possible memory allocations, while the value network has a single linear output estimating the state value. The generated experience data was split into 70 % for training and 30 % for evaluation, which is standard in DRL-based optimization studies.

5.2. Performance metrics

To evaluate the effectiveness of the proposed approach, we utilize several performance metrics, including:

Latency: The time taken for a function to execute, measured in milliseconds. Lower latency indicates better performance. Latency is expressed by Eq. (11):

L = (Tend − Tstart) / N   (11)

where Tstart and Tend are the start and end times of execution, and N is the number of function invocations.

Cost: The total cost incurred during function execution, measured in US dollars. Our goal is to minimize this cost. Cost is expressed by Eq. (12):

C = Σ_{i=1}^{N} (Mi × Pmem + Ti × Pexec)   (12)

where Mi is the allocated memory, Pmem is the price per MB, Ti is the execution time, and Pexec is the execution price per second.

Quality of Service (QoS): A composite score reflecting the reliability of the service and user satisfaction; a composite metric based on latency and availability, expressed by Eq. (13):

QoS = 1 / (L + δ(1 − A))   (13)

where A is the availability factor (the percentage of successful executions), which lets QoS take into account the impact of availability on overall system performance, and δ is a weighting parameter.

Utilization: The efficiency of memory usage during function execution, aiming for optimal allocation without wastage. Utilization is expressed by Eq. (14):

U = (Mused / Mallocated) × 100   (14)

where Mused is the actual memory usage and Mallocated is the assigned memory.
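The four metrics above can be computed directly from execution logs. A minimal sketch (the function and parameter names are ours; the units follow Eqs. (11)-(14)):

```python
def latency(t_start, t_end, n_invocations):
    """Eq. (11): mean execution time per invocation."""
    return (t_end - t_start) / n_invocations

def cost(memories_mb, exec_times_s, p_mem, p_exec):
    """Eq. (12): total cost over all invocations.

    p_mem is the price per MB and p_exec the price per second.
    """
    return sum(m * p_mem + t * p_exec
               for m, t in zip(memories_mb, exec_times_s))

def qos(mean_latency, availability, delta=1.0):
    """Eq. (13): composite QoS from latency and availability."""
    return 1.0 / (mean_latency + delta * (1.0 - availability))

def utilization(m_used, m_allocated):
    """Eq. (14): memory utilization as a percentage."""
    return m_used / m_allocated * 100.0
```

For instance, with full availability and a mean latency of 1 ms the QoS score is simply the reciprocal of the latency, and halving the allocated memory of a function whose actual usage is constant doubles its utilization percentage.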
A higher utilization value indicates optimal management of memory resources.

1 https://github.com/zahrashj-rad/Auto-Opt-Mem

Algorithm 2
Pseudo code for the MAPE loop phase.

1: Input: current state St, networks (πθ, Vϕ)
2: Output: updated state St+1
3: Monitor:
4:   Collect metrics (latency Li,t, cost Ci,t, utilization Ui,t, QoS Qi,t, requests)
5: Analyze:
6:   Compute reward Rt using the reward function
7:   Rt = −(α Li(Mi,t) + β Ci(Mi,t) − γ Ui(Mi,t)) + δ Qi(Mi,t)
8:   Normalize all components (cost, latency, utilization, QoS)
9:   using:
10:  X′ = (X − Xmin) / (Xmax − Xmin)
11: Plan:
12:  Select action At = πθ(St)
13:  Update the policy network πθ using the policy gradient
14:  ∇θ J(πθ) = E(s∼dπ, a∼πθ)[∇θ log πθ(a|s) · Qπθ(s, a)]
15:  Update the value network Vϕ using the Bellman equation
16:  Vπ(s) = Ea∼π[Rt + γ Vπ(St+1) | St = s, At = a]
17: Execute:
18:  Apply action At (update memory allocation Mi,t+1)
19:  Transition to the next state St+1
20: Return St+1

Algorithm 3
Pseudo code for the MAPE execution phase.

1: Input: environment, networks (πθ, Vϕ)
2: Output: optimized memory allocation policy
3: Initialize the environment and obtain the initial state S0
4: For each episode do
5:   For each time step t do
6:     Run the MAPE loop (Algorithm 2) with input St
7:     Observe reward Rt and next state St+1
8:     Store the tuple (St, At, Rt, St+1)
9:     Update πθ and Vϕ based on (St, At, Rt, St+1)
10:    Policy network update
11:    ∇θ J(πθ) = E(s∼dπ, a∼πθ)[∇θ log πθ(a|s) · Qπθ(s, a)]
12:    Value network update
13:    Vπ(s) = Ea∼π[Rt + γ Vπ(St+1) | St = s, At = a]
14:    Set St ← St+1
15:  End For
16: End For
17: Return optimized policy πθ

Table 5
Tools and technologies.

Component | Description
Cloud Provider | AWS Lambda (conceptual model), implemented as a simulated environment in code
Programming Language | Python, for implementing the deep reinforcement learning algorithms
Deep Learning Framework | TensorFlow/Keras, for building and training the deep learning models
Reinforcement Learning Algorithm | Proximal Policy Optimization (PPO)
Monitoring Tools | Simulated monitoring module (MAPE-K); no real CloudWatch data used
Data Management | Pandas, for manipulation and analysis of the performance metrics
Development Environment | Jupyter Notebook / Google Colab, for interactive development and experimentation

5.3. Experimental results

We evaluated the proposed approach against baseline methods, including machine learning-based approaches, and examined the impact of the learning rate and of the reward function.

5.3.1. First scenario: impact of learning rate on optimization

One of the important hyperparameters in deep reinforcement learning is the learning rate (LR), which controls the extent of updates to the model's weights. We tested different learning rates to examine their impact on Auto Opt Mem's efficiency.

In this experiment, different learning rates were applied to the problem of prediction and memory regulation in serverless environments, using a reinforcement learning agent based on policy gradients. The goal was to show the impact of the learning rate on learning the memory allocation policy and, ultimately, on the performance metrics: latency, cost, QoS, and utilization. Three learning rate values were tested, 0.01, 0.001, and 0.0001, and each experiment was run for 100 episodes. The environment model and the relationships between memory and the metrics were implemented as a noisy function that represents the overall system behavior and the effect of memory selection on the metrics.

The learning rate can affect the performance metrics as follows:

Latency: a very high learning rate may destabilize training, preventing the model from converging to an optimal policy. This instability leads to fluctuations in memory allocation and, consequently, higher execution latency. Conversely, a moderate learning rate supports stable convergence, resulting in lower latency.

Cost: if memory allocation decisions are unstable because of an excessively high learning rate, resource usage can increase, raising the execution cost. However, in some cases a higher learning rate accelerates convergence to an efficient allocation, thus reducing costs. The impact on cost is therefore twofold, depending on whether training stabilizes.

Quality of Service (QoS): high learning rates may cause instability in the memory allocation policy, leading to inconsistent performance and degraded QoS. In contrast, a well-tuned moderate learning rate achieves more stable optimization and improved QoS.

Utilization: rapid but unstable adjustments caused by a high learning rate can lead to inefficient memory allocations, resulting in either under-utilization or over-utilization of resources. A balanced learning rate is more likely to achieve efficient utilization.

Fig. 4 shows the learning rates and the corresponding results.

Fig. 4. Impact of learning rate on performance metrics.

The cost and quality of service results obtained from training the agent with the different learning rates can be summarized as follows.

The intermediate learning rate (LR = 0.001) yields the lowest latency and the highest QoS among the three values, while its cost increases significantly and its utilization decreases. The agent with LR = 0.001 learns faster and more decisively policies that favor higher memory allocations (allocations that reduce latency); this improves QoS but increases the cost per execution unit. The intermediate rate lets the agent make weight changes strong enough to reach lower-latency regions of the policy space, although these regions may be cost-ineffective.

The larger learning rate (LR = 0.01) shows very low cost and very high utilization, but latency and QoS remain at moderate levels. High learning rates usually make large updates; in this simulation the agent arrived at a policy that keeps cost low (e.g., choosing low or average memory sizes) while maintaining high utilization. This can be read as learning a cost-saving policy; however, it may cause fluctuations and fail to reach the optimal latency. Under noisy gradients the agent may also get stuck in a local optimum (with low cost).

The smaller learning rate (LR = 0.0001) produced intermediate results: latency and QoS close to LR = 0.01, with average cost. A rate that is too small leads to slow but stable learning; the agent may not have fully converged and therefore showed no significant improvement by the end of training. Table 6 summarizes the effect of the learning rate on the performance metrics.

Table 6
Comparison of learning rate effects on performance metrics.

Learning Rate (LR) | Latency | Cost | QoS | Utilization
0.01 (high) | Relatively high, unstable; converges to a moderate level | Lowest cost | Moderate QoS | Highest utilization
0.001 (medium) | Lowest latency (best) | Highest cost | Highest QoS | Reduced utilization
0.0001 (low) | Moderate, slow convergence | Moderate cost | Moderate QoS | Moderate utilization

5.3.2. Second scenario: reward function based on the MAPE loop

To further analyze the impact of Auto Opt Mem, we conducted eight experiments and calculated the reward function for each. As explained in Section 4, the reward function is:

Rt = Σi∈f [ −( α·(Li(Mi,t) − Lmin)/(Lmax − Lmin) + β·(Ci(Mi,t) − Cmin)/(Cmax − Cmin) − γ·(Ui(Mi,t) − Umin)/(Umax − Umin) ) + δ·(Qi(Mi,t) − Qmin)/(Qmax − Qmin) ]

where α, β, γ, and δ are the weights of the respective terms. The minus sign expresses the negative impact of latency and cost on the reward, while quality of service (Qi) enters with a positive sign; summing over the functions i ∈ f accounts for the influence of all functions. The minimum and maximum values used for normalization are:

Lmin = 80, Lmax = 100
Cmin = 0.15, Cmax = 0.50
Umin = 60, Umax = 80
Qmin = 40, Qmax = 70

The reward values for the experimental data are shown in Table 7 (latency in ms, cost in USD, QoS in %, utilization in %, reward as a normalized score). We calculate the reward function for each experiment; for Experiment 1:

R1 = −( (90 − 80)/(100 − 80) + (0.17 − 0.15)/(0.50 − 0.15) − (75 − 60)/(80 − 60) ) + (66 − 40)/(70 − 40)
R1 = −(0.5 + 0.1333 − 0.75) + 0.8667 = −(−0.1167) = 0.1167

The reward function for the remaining experiments is calculated similarly. These findings confirm the ability of Auto Opt Mem to optimize execution time while reducing costs. Fig. 5 plots the reward for each experiment (x-axis: experiment ID, 1–8; y-axis: normalized reward value), and Fig. 6 compares the reward function with the other metrics across the experiments. The reward values can be analyzed as follows:

• Increasing QoS and utilization increases the reward, because the system operates more efficiently.
• Reducing latency and cost has a positive effect on the reward, indicating more optimal performance.
• The highest reward is observed in Experiment 8 (0.91), which reflects the optimal balance between QoS, utilization, latency, and cost.
• The lowest reward is recorded in Experiment 6 (0.03), due to increased latency and decreased system efficiency (utilization).
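The normalized reward above can be sketched directly in Python. The weights α, β, γ, and δ default to 1 here because their tuned values are not reported, the min-max bounds are those listed above, and the sum over functions i ∈ f is omitted (single-function form):

```python
def min_max(x, lo, hi):
    """Min-max normalization: X' = (X - Xmin) / (Xmax - Xmin)."""
    return (x - lo) / (hi - lo)

# Normalization bounds used in the experiments
BOUNDS = {"L": (80, 100), "C": (0.15, 0.50), "U": (60, 80), "Q": (40, 70)}

def reward(latency, cost, util, qos_score,
           alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
    """Single-function form of Rt = -(a*L' + b*C' - g*U') + d*Q'.

    Latency and cost are penalized, utilization and QoS are rewarded;
    all four terms are min-max normalized before weighting.
    """
    l_n = min_max(latency, *BOUNDS["L"])
    c_n = min_max(cost, *BOUNDS["C"])
    u_n = min_max(util, *BOUNDS["U"])
    q_n = min_max(qos_score, *BOUNDS["Q"])
    return -(alpha * l_n + beta * c_n - gamma * u_n) + delta * q_n
```

As expected, configurations with lower latency and cost and higher utilization and QoS (such as Experiment 8 relative to Experiment 6) receive a strictly higher reward under this formulation.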
• Latency versus reward: Fig. 6a shows that decreasing latency generally increases the reward. The reward is higher at lower latencies (such as 84 and 85 ms) and decreases at higher latencies.
• Cost versus reward: Fig. 6b shows that as costs decrease, the rewards increase, indicating a favorable outcome across the experiments.
• Quality of service versus reward: Fig. 6c shows that increasing QoS usually increases the reward. QoS in the range of 70–72 % yields the highest reward values.
• Utilization versus reward: Fig. 6d shows that higher utilization usually leads to increased reward. The reward increases sharply for utilization values of 80 % and above.

These graphs show that the optimized system tends to reduce latency and cost and to increase quality of service and utilization in order to obtain the maximum reward. The suggested reward function, which incorporates delay, cost, utilization, and QoS, can effectively balance the system's various goals. Experiment 8, with the highest reward of 0.91, represents the optimal balance between all objectives, while Experiment 6, with the lowest reward of 0.03, performs worst because of its high delay and low utilization. The comparison graphs in Fig. 6 clearly show that improving any of the criteria, such as reducing delay and cost or increasing QoS and utilization, leads to an increase in the reward. This means that the reward function design is suitable and can serve as a criterion for multi-objective optimization.

5.3.3. Third scenario: comparison with previous studies

To validate our results, we compared Auto Opt Mem with two papers, [17] and [18], that tackled memory optimization in serverless computing. When compared to the works of Eismann et al. [17] and Jindal et al. [18], Auto Opt Mem achieves a more substantial reduction in execution latency, a greater cost reduction, and a significant improvement in QoS. Unlike previous studies that relied primarily on static function profiling or statistical estimation, Auto Opt Mem continuously learns and adapts to varying workloads, and it is therefore more adaptive and scalable. Eismann et al. [17] developed a model for resource prediction based on monitoring a single memory size, achieving performance gains but with limited adaptability to dynamic workloads. Jindal et al. [18] introduced a statistical and deep learning approach to estimate function capacity, improving resource efficiency but lacking real-time optimization capabilities. Auto Opt Mem integrates reinforcement learning to dynamically adjust memory allocations, ensuring optimal performance across diverse execution environments without requiring manual intervention.

This section assesses the proposed Auto Opt Mem in a realistic serverless simulation environment. Four representative workloads were modeled, namely ML inference, API gateway, data processing, and video processing, each independently defined by its memory range, baseline latency, and cost functions. A neural network with two hidden layers (32 and 16 neurons) was used to train the PPO agent for 80 episodes. The reward function incorporates normalized latency, cost, and QoS terms that drive the policy toward balanced optimization. An autonomic MAPE-K (Monitor, Analyze, Plan, Execute over a shared Knowledge base) loop was implemented at runtime to make the system self-adaptive: it continuously monitors recent performance, analyzes QoS and cost trends, plans corrective actions such as tuning the memory, and executes them through adjustment of the PPO policy parameters. The results are summarized in Table 8; Auto Opt Mem demonstrates superior improvements across all metrics compared to existing research, establishing its effectiveness. Fig. 7 shows the comparison with previous studies.

Table 8 reports the average performance achieved by Auto Opt Mem across all benchmark functions in terms of latency reduction, cost savings, and QoS improvement, directly compared with the results of Sizeless [17] and FnCapacitor [18]. On average, the proposed Auto Opt Mem achieves 25–30 % less latency, 15–18 % less cost, and 10–12 % more QoS than the Sizeless [17] baseline, while outperforming FnCapacitor [18] in all key metrics. This shows that the PPO-based MAPE-K autonomic loop can adjust dynamically to workload variability and provide optimized resource allocation in serverless environments.

The results of the eight experiments conducted with the different metrics are presented in Table 9. Synthetic yet reproducible data were used, and the dataset is openly provided for replication. Table 10 summarizes the statistical comparison of Auto Opt Mem with the baseline methods.
For each metric, the mean, standard deviation, and 95 % confidence interval were calculated. For instance, Auto Opt Mem has a latency of 267.92 ms ± 1.65 with a 95 % CI of (266.74, 269.10), outperforming both Sizeless [17] and FnCapacitor [18]. For statistical significance, we use an independent two-sample t-test. Our results indicate that the improvements of Auto Opt Mem over Sizeless [17] are statistically significant for all metrics (p < 0.02). For FnCapacitor [18], the cost difference is significant, with p = 0.035. The statistical analysis confirms that Auto Opt Mem provides consistently better performance, with several improvements being significant at the 95 % confidence level.

Table 7
Reward function values in different experiments.

Experiment | Latency (ms) | Cost ($) | QoS (%) | Utilization (%) | Reward
1 | 90 | 0.17 | 66 | 75 | 0.11
2 | 88 | 0.16 | 63 | 78 | 0.43
3 | 92 | 0.18 | 67 | 74 | 0.8
4 | 87 | 0.15 | 70 | 80 | 0.65
5 | 89 | 0.17 | 64 | 76 | 0.21
6 | 91 | 0.16 | 68 | 73 | 0.03
7 | 86 | 0.15 | 71 | 79 | 0.65
8 | 85 | 0.14 | 72 | 82 | 0.91

Fig. 5. Reward function values for eight experimental runs.

The average latency, average cost, and average quality of service of the different methods are shown in Fig. 8. In Fig. 8a, Auto Opt Mem (green line) has the lowest latency in all tests, indicating better optimization of memory allocation and faster execution of serverless functions. In Fig. 8b, Auto Opt Mem minimizes the cost of executing functions in all experiments; the method of Jindal et al. [18] reduces the cost compared to Eismann et al. [17] but remains higher than Auto Opt Mem. In Fig. 8c, Auto Opt Mem (green line) has the highest quality of service (QoS), indicating increased reliability and optimal performance under different workload conditions; the method of Jindal et al. [18] provides better QoS than that of Eismann et al. [17] but still falls short of Auto Opt Mem's. Table 11 further illustrates the differences between our approach and the previous methods.

Compared with static allocation strategies, Auto Opt Mem avoids a large amount of resource wastage by allocating memory according to actual function requirements rather than over-provisioning. Such dynamic allocation leads to substantial cost savings, especially under varying workloads or mixed kinds of functions. Moreover, the ability of Auto Opt Mem to minimize latency improves application performance and user experience: through efficient memory management and reduced cold-start time, it ensures that functions execute quickly and reliably even in high-demand situations. In contrast to heuristic methods, which typically rest on oversimplified assumptions and struggle with dynamic states, Auto Opt Mem employs deep reinforcement learning to derive decisions from the actual current state. Such responsiveness is important in numerous serverless computing applications. While machine learning-based methods are somewhat flexible, Auto Opt Mem is better at balancing conflicting objectives, e.g., cost minimization versus optimal performance; by means of a reward function that considers several parameters, it provides an overall optimization of the system.

AWS Lambda's memory-to-vCPU coupling and pricing scheme closely align with those of Azure Functions and Google Cloud Functions. Auto Opt Mem is provider-agnostic, since platform-specific constraints, such as memory ranges and CPU scaling rules, are embedded in the state representation; thus, even though execution was simulated, the framework is transferable to real AWS Lambda deployments and to other cloud providers as well.

6.2. Potential application scenarios

Due to resource and space limitations, we used controlled micro-benchmarks.
Still, Auto Opt Mem is not restricted to this setup and can also be used in real applications. For example, it helps with machine learning inference tasks, such as running image or text classification models where reducing cost and latency is important. It is also useful for video transcoding, since converting formats with tools such as ffmpeg usually consumes a great deal of CPU and memory. Another case is data preprocessing, where large datasets need to be read, compressed, or filtered. Auto Opt Mem also works well in API aggregation, where several external services are called at the same time and the process is mostly I/O-bound. These examples illustrate that Auto Opt Mem is workload-agnostic and can be applied to real-world scenarios; a full-scale practical evaluation is left for future research. Table 12 lists real-world scenarios in which Auto Opt Mem can be used.

6. Discussion

In this section, we discuss the AWS Lambda platform and potential application scenarios, and then provide a comparative analysis of memory configuration in serverless computing.

6.1. Use of AWS

The experiments were conducted in a simulated environment; the workload behavior, latency patterns, and cost models were derived from documented AWS Lambda configuration rules. This ensures that the characteristics of a real AWS serverless environment are captured while still allowing full reproducibility. AWS Lambda serves as the conceptual reference platform because of its very high market share and its representative resource allocation model.

6.3. Comparative analysis

Table 13 summarizes recent research in intelligent cloud computing that focuses on automation, optimization, and deep learning, and shows how the proposed Auto Opt Mem framework relates to these studies while focusing on dynamic memory and performance optimization in serverless environments.

Fig. 6. Comparison of reward function and other metrics.

7. Conclusions

Memory configuration in serverless computing is challenging because of the ephemeral nature of serverless functions, which are short-lived and stateless.

Table 8
Comparison with previous studies.

Approach | Latency Reduction (%) | Cost Reduction (%) | QoS Improvement (%)
Auto Opt Mem vs Sizeless [17] | 25–30 | 15–18 | 10–12
Auto Opt Mem vs FnCapacitor [18] | 5–7 | 6–8 | 2–3

The results show that Auto Opt Mem optimizes resource utilization, decreases operational cost and latency, and enhances quality of service (QoS), making it a good fit for developers of serverless systems. Auto Opt Mem shows noticeable improvements over previous methods: compared to Sizeless, it reduces latency by 25–30 %, lowers cost by 15–18 %, and improves QoS by 10–12 %; against FnCapacitor, it achieves a 5–7 % latency reduction, a 6–8 % cost reduction, and a 2–3 % QoS improvement. On average, our experiments show that Auto Opt Mem provides 16.8 % lower latency, 11.8 % cost reduction, and 6.8 % QoS improvement across both baselines.
This research examines memory configuration mechanisms and classifies them, within serverless computing, into three main approaches: machine learning-based, exploration-based, and framework-based. The advantages and disadvantages of each mechanism, as well as the challenges and performance metrics affecting their effectiveness, are discussed. Memory configuration is one of the important challenges in serverless computing; in this paper we propose an autonomous deep learning-based memory optimization system for serverless computing, referred to as Auto Opt Mem.

In our approach, one of the important hyperparameters of deep reinforcement learning is the learning rate, which controls the extent of updates to the model's weights. A high learning rate typically causes the model to update its weights more aggressively and may lead to inefficient memory usage because of unstable learning, resulting in suboptimal allocation. A very small learning rate, on the other hand, leads to very slow training and slow convergence, and thus to late optimization of memory. In short, a high learning rate can reach convergence quickly but may overshoot the optimal point; the resulting irregular updates require additional iterations to correct errors, which increases memory consumption. A low learning rate supports more stable and gradual convergence but may require more episodes to converge, which can increase the memory load through long-term storage of intermediate states.

Future research directions include extending Auto Opt Mem to multi-cloud environments, and extending and stress-testing Auto Opt Mem's real-time adaptability under more diverse and large-scale workloads. Additionally, the scalability of Auto Opt Mem can be investigated on serverless frameworks other than AWS Lambda.

Fig. 7. Comparison with previous studies.

Fig. 8. Comparison of performance metrics.

Table 9
Comparison of approaches across 8 experiments.

Approach | Latency (ms) | Cost ($) | QoS (%)
Sizeless [17] | 362.85, 370.14, 355.92, 380.18, 364.77, 359.60, 372.31, 368.54 | 0.00243, 0.00236, 0.00248, 0.00244, 0.00241, 0.00237, 0.00246, 0.00242 | 80.15, 81.24, 79.87, 80.45, 81.06, 80.83, 79.92, 80.61
FnCapacitor [18] | 278.42, 271.66, 282.10, 276.74, 274.89, 280.13, 273.80, 275.55 | 0.00256, 0.00252, 0.00257, 0.00255, 0.00251, 0.00253, 0.00259, 0.00254 | 89.44, 90.22, 88.97, 89.75, 90.13, 89.61, 89.88, 90.06
Auto Opt Mem | 267.91, 270.34, 265.10, 268.05, 266.83, 269.14, 267.54, 266.41 | 0.00221, 0.00219, 0.00223, 0.00220, 0.00218, 0.00222, 0.00221, 0.00220 | 91.08, 91.46, 90.83, 91.21, 91.37, 90.97, 91.32, 91.15

Table 10
Statistical comparison of Auto Opt Mem with baseline methods.

Approach | Metric | Average | Std. Dev. | 95 % CI | p-value vs. Auto Opt Mem
Sizeless [17] | Latency (ms) | 366.29 | 8.18 | (360.43, 372.15) | 0.002
Sizeless [17] | Cost ($) | 0.00243 | 0.00004 | (0.00240, 0.00246) | 0.018
Sizeless [17] | QoS (%) | 80.52 | 0.43 | (80.22, 80.82) | 0.001
FnCapacitor [18] | Latency (ms) | 276.29 | 3.29 | (273.97, 278.61) | 0.120
FnCapacitor [18] | Cost ($) | 0.00255 | 0.00003 | (0.00253, 0.00257) | 0.035
FnCapacitor [18] | QoS (%) | 89.88 | 0.41 | (89.59, 90.17) | 0.280
Auto Opt Mem | Latency (ms) | 267.92 | 1.65 | (266.74, 269.10) | –
Auto Opt Mem | Cost ($) | 0.00221 | 0.00002 | (0.00220, 0.00222) | –
Auto Opt Mem | QoS (%) | 91.17 | 0.24 | (91.00, 91.34) | –

Funding

No funding was received for this work.
Table 11
Differentiation of our approach from previous studies.

Aspect | Sizeless [17] | FnCapacitor [18] | Our Work (Auto Opt Mem)
Methodology | Multi-target regression using monitoring data from a single memory size | Sandboxing, performance tests, and statistical/DNN modeling | Deep reinforcement learning with a MAPE control loop
Decision type | Predicts execution time and cost for other memory sizes | Estimates function capacity (maximum concurrency under an SLO) | Learns and selects the optimal memory configuration
Adaptivity | Static once trained; no continuous learning | Requires offline profiling for changes | Dynamic and continuous adaptation at runtime
Focus | Memory-performance trade-offs | Function capacity and concurrency | Memory optimization balancing latency, cost, QoS, and utilization
Limitation | No runtime adaptability | Re-profiling needed for new workloads | –
Innovation | Efficient prediction with limited input | Accurate function-capacity estimation | Self-adaptive and autonomous optimization

Table 12
Application scenarios for Auto Opt Mem.

Scenario | Type of workload | Role of Auto Opt Mem
ML inference | CPU-bound | Optimizes latency and cost by adjusting memory/CPU
Video transcoding | CPU- and memory-intensive | Balances higher memory cost against faster execution time
Data preprocessing (ETL) | Mixed (CPU + I/O) | Adapts memory allocation based on input size
API aggregation | I/O-bound | Keeps memory low while ensuring QoS in parallel API calls

CRediT authorship contribution statement

Zahra Shojaee Rad: Resources, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation, Conceptualization. Mostafa Ghobaei-Arani: Writing – review & editing, Writing – original draft, Visualization, Validation, Supervision, Software, Resources, Project administration. Reza Ahsan: Writing – review & editing, Writing – original draft.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

References

[1] I. Baldini, P. Castro, K. Chang, P. Cheng, S. Fink, V. Ishakian, N. Mitchell, et al., Serverless computing: current trends and open problems, Res. Adv. Cloud Comput. (2017) 1–20.
[2] A. Ebrahimi, M. Ghobaei-Arani, H. Saboohi, Cold start latency mitigation mechanisms in serverless computing: taxonomy, review, and future directions, J. Syst. Archit. 151 (2024) 103115.
[3] AWS, ServerlessVideo: connect with users around the world!, https://serverlessland.com/, 2023.
Table 13
Comparative analysis with recent studies in the field of intelligent cloud computing.

Ref | Focus Area | Technique | Key Contribution | Relation to Present Work
[59] | Secure data deduplication | Convergent encryption | Reduces redundancy and ensures secure cloud storage | Our DRL approach similarly targets resource efficiency, but in serverless memory optimization
[60] | Cloud key management security | Machine learning-based framework | Intelligent key lifecycle management | Both emphasize intelligent automation for secure and efficient cloud operations
[61] | Deep learning for cloud/edge/fog/IoT | Survey of deep learning models | Provides insight into intelligent distributed learning paradigms | Inspires our DL-driven resource optimization framework
[62] | Cloud security and privacy | Deep learning-based attack detection | Enhances privacy and adaptive threat response | Our work extends this intelligence toward performance optimization and QoS improvement in serverless systems

[4] AWS, Serverless case study – Netflix, https://dashbird.io/blog/serverless-case-study-netflix/, 2020.
[5] CapitalOne, Capital One saves developer time and reduces costs by going serverless on AWS, https://aws.amazon.com/solutions/case-studies/capital-one-lambda-ecs-case-study/, 2023.
[6] E. Johnson, Deploying ML models with serverless templates, https://aws.amazon.com/blogs/compute/deploying-machine-learning-models-with-serverless-templates/, 2021.
[7] A. Sojasingarayar, Build and deploy LLM application in AWS, https://medium.com/@abonia/build-and-deploy-llm-application-in-aws-cca46c662749, 2024.
[8] A. Gholami, M. Ghobaei-Arani, A trust model based on quality of service in cloud computing environment, Int. J. Database Theor. Appl. 8 (5) (2015) 161–170, https://doi.org/10.14257/ijdta.2015.8.5.13.
[9] DataDog, The State of Serverless, https://www.datadoghq.com/state-of-serverless/, 2020.
[10] M. Tari, M. Ghobaei-Arani, J. Pouramini, M. Ghorbian, Auto-scaling mechanisms in serverless computing: a comprehensive review, Comput. Sci. Rev. 53 (2024) 100650, https://doi.org/10.1016/j.cosrev.2024.100650.
[11] M. Ghorbian, M. Ghobaei-Arani, R. Asadolahpour-Karimi, Function placement approaches in serverless computing: a survey, J. Syst. Archit. 157 (2024) 103291.
[12] B. Jacob, R. Lanyon-Hogg, D.K. Nadgir, A.F. Yassin, A Practical Guide to the IBM Autonomic Computing Toolkit, IBM International Technical Support Organization, 2004.
[13] M. Maurer, I. Breskovic, V.C. Emeakaroha, I. Brandic, Revealing the MAPE loop for the autonomic management of cloud infrastructures, in: 2011 IEEE Symposium on Computers and Communications (ISCC), IEEE, 2011, pp. 147–152.
[14] S.J. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, Pearson, 2016.
[15] L.P. Kaelbling, M.L. Littman, A.W. Moore, Reinforcement learning: a survey, J. Artif. Intell. Res. 4 (1996) 237–285.
[16] R. Rajavel, M. Thangarathanam, Adaptive probabilistic behavioural learning system for the effective behavioural decision in cloud trading negotiation market, Fut. Gener. Comput. Syst. 58 (2016) 29–41.
[17] S. Eismann, L. Bui, J. Grohmann, C. Abad, N. Herbst, S. Kounev, Sizeless: predicting the optimal size of serverless functions, in: Proceedings of the 22nd International Middleware Conference, 2021, pp. 248–259.
[18] A. Jindal, M. Chadha, S. Benedict, M. Gerndt, Estimating the capacities of function-as-a-service functions, in: Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing Companion, 2021, pp. 1–8.
[19] D. Mvondo, M. Bacou, K. Nguetchouang, L. Ngale, S. Pouget, J. Kouam, R. Lachaize, et al., OFC: an opportunistic caching system for FaaS platforms, in: Proceedings of the Sixteenth European Conference on Computer Systems, 2021, pp. 228–244.
[20] M.-H. Kim, J. Lee, H. Yu, E. Lee, Improving memory utilization by sharing DNN models for serverless inference, in: 2023 IEEE International Conference on Consumer Electronics (ICCE), IEEE, 2023, pp. 1–6.
[21] S. Agarwal, M.A. Rodriguez, R. Buyya, Input-based ensemble-learning method for dynamic memory configuration of serverless computing functions, arXiv preprint arXiv:2411.07444, 2024.
[22] G. Safaryan, A. Jindal, M. Chadha, M. Gerndt, SLAM: SLO-aware memory optimization for serverless applications, in: 2022 IEEE 15th International Conference on Cloud Computing (CLOUD), IEEE, 2022, pp. 30–39.
[23] R. Cordingly, S. Xu, W. Lloyd, Function memory optimization for heterogeneous serverless platforms with CPU time accounting, in: 2022 IEEE International Conference on Cloud Engineering (IC2E), IEEE, 2022, pp. 104–115.
[24] T. Zubko, A. Jindal, M. Chadha, M. Gerndt, MAFF: self-adaptive memory optimization for serverless functions, in: European Conference on Service-Oriented and Cloud Computing, Springer International Publishing, Cham, 2022, pp. 137–154.
[25] J. Spillner, Resource management for cloud functions with memory tracing, profiling and autotuning, in: Proceedings of the 2020 Sixth International Workshop on Serverless Computing, 2020, pp. 13–18.
[26] Z. Li, H. Yu, G. Fan, Time-cost efficient memory configuration
[40] Z. Shojaee Rad, M. Ghobaei-Arani, R. Ahsan, Memory orchestration mechanisms in serverless computing: a taxonomy, review and future directions, Cluster Comput. (2024) 1–27.
[41] R. Wolski, C. Krintz, F. Bakir, G. George, W.-T. Lin, CSPOT: portable, multi-scale functions-as-a-service for IoT, in: Proceedings of the 4th ACM/IEEE Symposium on Edge Computing (SEC '19), Association for Computing Machinery, New York, 2019, pp. 236–249, https://doi.org/10.1145/3318216.3363314.
[42] V. Yussupov, U. Breitenbücher, F. Leymann, M. Wurster, A systematic mapping study on engineering function-as-a-service platforms and tools, in: Proceedings of the 12th IEEE/ACM International Conference on Utility and Cloud Computing (UCC '19), Association for Computing Machinery, New York, 2019, pp. 229–240, https://doi.org/10.1145/3344341.3368803.
[43] Z. Shojaee Rad, M. Ghobaei-Arani, Federated serverless cloud approaches: a comprehensive review, Comput. Electric. Eng. 124 (2025) 110372.
[44] I. Baldini, P. Castro, K. Chang, P. Cheng, S. Fink, V. Ishakian, N. Mitchell, et al., Serverless computing: current trends and open problems, Res. Adv. Cloud Comput. (2017) 1–20.
[45] M. Elsakhawy, M. Bauer, FaaS2F: a framework for autoining execution-SLA in serverless computing, in: 2020 IEEE Cloud Summit, 2020, pp. 58–65, https://doi.org/10.1109/IEEECloudSummit48914.2020.00015.
[46] A.U. Gias, G. Casale, COCOA: cold start aware capacity planning for function-as-a-service platforms, in: 2020 28th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2020, pp. 1–8, https://doi.org/10.1109/MASCOTS50786.2020.9285966.
[47] C.K. Dehury, S.N. Srirama, T.R. Chhetri, CCoDaMiC: a framework for coherent coordination of data migration and computation platforms, Futur. Gener. Comput. Syst. 109 (2020) 1–16, https://doi.org/10.1016/j.future.2020.03.029.
[48] A. Tariq, A. Pahl, S. Nimmagadda, E. Rozner, S. Lanka, Sequoia: enabling quality-of-service in serverless computing, in: Proceedings of the 11th ACM Symposium on Cloud Computing (SoCC '20), Association for Computing Machinery, New York, 2020, pp. 311–327, https://doi.org/10.1145/3419111.3421306.
[49] J. Manner, S. Kolb, G. Wirtz, Troubleshooting serverless functions: a combined monitoring and debugging approach, SICS Softw.-Intensiv. Cyber-Phys. Syst. 34 (2) (2019) 99–104, https://doi.org/10.1007/s00450-019-00398-6.
for serverless workflow applications, Concurr. Comput.: Pract. Exp. 34 (27) (2022) [50] J. Nupponen, D. Taibi, Serverless: what it is, what to do and what not to do, in: e7308, no. 2020 IEEE International Conference on Software Architecture Companion (ICSA- [27] Andrea Sabbioni, Lorenzo Rosa, Armir Bujari, Luca Foschini, Antonio Corradi, C), 2020, pp. 49–50, https://doi.org/10.1109/ICSA-C50368.2020.00016. A shared memory approach for function chaining in serverless platforms, in: 2021 [51] G. Cordasco, M. D’Auria, A. Negro, V. Scarano, C. Spagnuolo, Fly: a domain- IEEE Symposium on Computers and Communications (ISCC), IEEE, 2021, pp. 1–6. specific language for scientific computing on faas, in: U. Schwardmann, C. Boehme, [28] Aakanksha Saha, Sonika Jindal, EMARS: efficient management and allocation of B. Heras D, V. Cardellini, E. Jeannot, A. Salis, C. Schifanella, R.R. Manumachu, resources in serverless, in: 2018 IEEE 11th International Conference on Cloud D. Schwamborn, L. Ricci, O. Sangyoon, T. Gruber, L. Antonelli, S.L. Scott (Eds.), Computing (CLOUD), IEEE, 2018, pp. 827–830. Euro-Par 2019: Parallel Processing Workshops, Springer, Cham, 2020, [29] Amit Samanta, Faraz Ahmed, Lianjie Cao, Ryan Stutsman, Puneet Sharma, pp. 531–544. Persistent memory-aware scheduling for serverless workloads, in: 2023 IEEE [52] B. Jambunathan, K. Yoganathan, Architecture decision on using microservices or International Parallel and Distributed Processing Symposium Workshops serverless functionswith containers, in: 2018 International Conference on Current (IPDPSW), IEEE, 2023, pp. 615–621. Trends Towards Converging Technologies (ICCTCT), 2018, pp. 1–7, https://doi. [30] Meenakshi Sethunath, Yang Peng, A joint function warm-up and request routing org/10.1109/ICCTCT.2018.8551035. scheme for performing confident serverless computing, High-Confidence Comput. [53] A. Keshavarzian, S. Sharifian, S. Seyedin, Modified deep residual network 2 (3) (2022) 100071 no. 
architecture deployed on serverless framework of iot platform based on human [31] Anisha Kumari, Manoj Kumar Patra, Bibhudatta Sahoo, Ranjan Kumar Behera, activity recognition application, Futur. Gener. Comput. Syst. 101 (2019) 14–28, Resource optimization in performance modeling for serverless application, Int. J. https://doi.org/10.1016/j.future.2019.06.009. Inf. Technol. 14 (6) (2022) 2867–2875, no. [54] Gerald. Tesauro, Temporal difference learning and TD-gammon, Commun. ACM 38 [32] Vahldiek-Oberwagner, Anjo, and Mona Vij. "Meshwa: the case for a memory-safe (3) (1995) 58–68. software and hardware architecture for serverless computing." arXiv preprint [55] Eric Rutten, Nicolas Marchand, Daniel Simon, Feedback control as MAPE-K loop in arXiv:2211.08056 (2022). autonomic computing. Software Engineering for Self-Adaptive Systems III. [33] Divyanshu Saxena, Tao Ji, Arjun Singhvi, Junaid Khalid, Aditya Akella, Memory Assurances: International Seminar, Dagstuhl Castle, Germany, December 15-19, deduplication for serverless computing with medes, in: Proceedings of the 2013, Revised Selected and Invited Papers, Springer International Publishing, Seventeenth European Conference on Computer Systems, 2022, pp. 714–729. Cham, 2018, pp. 349–373. [34] Jie Li, Laiping Zhao, Yanan Yang, Kunlin Zhan, Keqiu Li, Tetris: memory-efficient [56] Evangelina Lara, Leocundo Aguilar, Mauricio A. Sanchez, Jesús A. García, Adaptive serverless inference through tensor sharing, in: 2022 USENIX Annual Technical security based on mape-k: a survey. Applied Decision-Making: Applications in Conference (USENIX ATC 22), 2022. Computer Sciences and Engineering, Springer International Publishing, Cham, [35] Dmitrii Ustiugov, Plamen Petrov, Marios Kogias, Edouard Bugnion, Boris Grot, 2019, pp. 157–183. Benchmarking, analysis, and optimization of serverless function snapshots, in: [57] Jeffrey O. Kephart, David M. Chess, The vision of autonomic computing, Computer. 
Proceedings of the 26th ACM International Conference on Architectural Support (Long. Beach. Calif) 36 (1) (2003) 41–50. for Programming Languages and Operating Systems, 2021, pp. 559–572. [58] Alistair McLean, Roy Sterritt, Autonomic Computing in the Cloud: an overview of [36] Ao Wang, Jingyuan Zhang, Xiaolong Ma, Ali Anwar, Lukas Rupprecht, past, present and future trends, in: The 2023 IARIA Annual Congress on Frontiers Dimitrios Skourtis, Vasily Tarasov, Feng Yan, Yue Cheng, {InfiniCache}: exploiting in Science, Technology, Services, and Applications: Technical Advances and ephemeral serverless functions to build a {cost-effective} memory cache, in: 18th Human Consequences, 2023. USENIX conference on file and storage technologies (FAST 20), 2020, pp. 267–281. [59] Shahnawaz Ahmad, Mohd Arif, Javed Ahmad, Mohd Nazim, Shabana Mehfuz, [37] Anurag Khandelwal, Yupeng Tang, Rachit Agarwal, Aditya Akella, Ion Stoica, Jiffy: Convergent encryption enabled secure data deduplication algorithm for cloud elastic far-memory for stateful serverless analytics, in: Proceedings of the environment, Concurr. Computat.: Pract. Exp. 36 (21) (2024) e8205. Seventeenth European Conference on Computer Systems, 2022, pp. 697–713. [60] Shahnawaz Ahmad, Shabana Mehfuz, Shabana Urooj, Najah Alsubaie, Machine [38] Nikolos, Orestis Lagkas, Chloe Alverti, Stratos Psomadakis, Georgios Goumas, and learning-based intelligent security framework for secure cloud key management, Nectarios Koziris. "Fast and efficient memory reclamation for serverless Cluster. Comput. 27 (5) (2024) 5953–5979. MicroVMs." arXiv preprint arXiv:2411.12893 (2024). [61] Shahnawaz Ahmad, Iman Shakeel, Shabana Mehfuz, Javed Ahmad, Deep learning [39] Zahra Shojaee Rad, Mostafa Ghobaei-Arani, Data pipeline approaches in serverless models for cloud, edge, fog, and IoT computing paradigms: survey, recent computing: a taxonomy, review, and research trends, J. Big. Data 11 (1) (2024) advances, and future directions, Comput. Sci. 
Rev. 49 (2023) 100568. 1–42, no. [62] Shahnawaz Ahmad, Mohd Arif, Shabana Mehfuz, Javed Ahmad, Mohd Nazim, Deep learning-based cloud security: innovative attack detection and privacy focused key management, IEEE Trans. Comput. (2025). 20