Journal of Systems Architecture 160 (2025) 103347
Contents lists available at ScienceDirect
Journal of Systems Architecture
journal homepage: www.elsevier.com/locate/sysarc
Eliminating duplicate writes of logging via no-logging flash translation layer
in SSDs
Zhenghao Yin a, Yajuan Du a,∗, Yi Fan a, Sam H. Noh b
a Wuhan University of Technology, Wuhan, 430070, Hubei Province, China
b Virginia Tech, Blacksburg, 24061-0326, VA, USA
ARTICLE INFO

Keywords: Flash memory, Transaction, Flash translation layer, Duplicate writes

ABSTRACT

With the development of high-density flash memory techniques, SSDs have achieved high performance and large capacity. Databases often use logging to ensure transactional atomicity of data updates. However, it introduces duplicate writes because of multi-versioning, which significantly weakens the performance and endurance of SSDs. This is also often considered as the main reason for slow response of databases. This paper proposes a novel flash translation layer (FTL) for SSDs, which we refer to as NoLgn-FTL, to reduce the overhead of logging-induced duplicate writes by exploiting the inherent multi-version feature of flash memories. Specifically, during a transaction, NoLgn-FTL retains the old data as valid and establishes the mapping between the new physical addresses and the old physical addresses. Thus, the database can easily roll back to the old-version data to maintain system consistency when a power failure occurs. To evaluate NoLgn-FTL, we implement it within FEMU and modify the SQLite database and the file system to make them compatible with the extended abstractions provided by NoLgn-FTL. Experimental results show that, in normal synchronization mode, NoLgn-FTL can reduce SSD writes by 20% and improve database performance by 15% on average.
1. Introduction

Solid-state drives (SSDs) have been widely adopted in database systems due to their high performance. Databases employ logging-based methods, such as write-ahead logging (WAL) and rollback journals, to ensure the transactional atomicity of multiple data updates. In these methods, data is first written to persistent logs before updating the original data, which induces duplicate writes [1]. For SSDs, duplicate writes occur in the following manner. First, the updated data and metadata are written into log files in flash memory. Then, due to the inherent out-of-place update nature of the SSD [2], the updated data is written into new flash pages rather than overwriting the original ones [3]. Thus, one user data write induces two SSD internal writes onto two different flash pages, increasing extra program/erase (P/E) cycles. This reduces SSD lifespan and degrades overall performance by consuming write throughput.

To address the issue of SSD duplicate writes in logging-based databases, researchers have proposed data remapping methods. These methods aim to convert logs directly into new data by modifying the mapping between logical pages (LPs) and physical pages (PPs) in flash memory [4,5]. However, dealing with the inconsistency of logging and data LPs is challenging during power failures.

To investigate the performance of database logging in SSDs, this paper first performs a preliminary study to collect the latency that occurs during WAL-based data updates. We find that WAL takes a larger proportion of latency than regular data updates, especially for small data updates. This inspires us to design a direct update scheme that alleviates the overhead of duplicate writes by leveraging the out-of-place update feature of flash memory. This feature inherently maintains multiple versions of data upon updates, allowing the database to easily roll back to the previous version of the data in the event of a power failure or system crash, ensuring data consistency without the need for explicit logging.

This paper proposes a no-logging flash translation layer (NoLgn-FTL) by reusing old flash data pages. The key idea is to keep the mapping information of old data during transactions, eliminating the need for separate log writes. We establish a mapping table between new and old physical addresses (called a P2P table) in the RAM of the flash controller. Meanwhile, the old physical address is written into the out-of-band area of new flash pages, providing a backup of the mapping information. In this way, uncommitted transactions can be rolled back to the old data version upon power failure, thus maintaining consistency. We implement NoLgn-FTL within FEMU and evaluate it with the SQLite database. Experimental results show that, in normal synchronization mode, NoLgn-FTL can reduce SSD writes by 20% and improve database performance by 15% on average, compared to existing methods. Our paper makes the following contributions.

• We conduct a preliminary study that reveals the significant latency impact of logging, compared to pure data updates in databases, motivating the need for a more efficient approach to handling duplicate writes.
• We propose a novel SSD FTL, called NoLgn-FTL, which fully utilizes the out-of-place update nature of flash memory to largely remove duplicate writes caused by database logging.
• We modify SQLite and integrate NoLgn-FTL in the FEMU simulator. We verify the efficiency of NoLgn-FTL in reducing duplicate writes and improving database performance through extensive experiments.

The rest of this paper is organized as follows. Section 2 introduces the basics of SSDs and logging methods as well as the motivation of this paper. Section 3 presents the design of NoLgn-FTL. Section 4 shows the experimental setup and evaluation results of NoLgn-FTL. Section 5 reviews existing work, and Section 6 concludes this paper.

∗ Corresponding author. E-mail address: dyj@whut.edu.cn (Y. Du).
https://doi.org/10.1016/j.sysarc.2025.103347
Received 31 October 2024; Received in revised form 15 December 2024; Accepted 18 January 2025; Available online 25 January 2025
1383-7621/© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.

2. Background and motivation

This section begins by introducing the basics of SSDs, with a focus on logging methods. Then, we present existing remapping-based methods. Finally, we present the preliminary study as the motivation for this paper.

2.1. Basics of SSD

Flash memory utilizes a flash translation layer (FTL) to store and manage a logical-to-physical address translation, called L2P mapping. This mapping is often stored in the SRAM internal to the SSD to achieve high access performance. Meanwhile, the logical address is also stored in the out-of-band (OOB) area of physical flash pages. Upon a data update request, the FTL first stores the new data in new flash pages and invalidates the old flash pages. Meanwhile, the L2P mapping is directed to the new physical page addresses, and the requested logical addresses are also stored in the OOB areas as the new flash pages are written. The invalidated old pages are reclaimed during garbage collection (GC). As shown in Fig. 1a, when data with physical addresses P1, P2, and P3 need to be updated, the new data would eventually be stored in new physical pages P1′, P2′, and P3′. (Note that Li and Pi in the figure represent logical addresses and physical addresses.)

Fig. 1. Existing write-ahead logging schemes in SSDs.

2.2. Write ahead logging

Relational databases are typically run in rollback mode or write-ahead log mode in order to support atomic execution of transactions [1,6,7]. New updates are first written in a dedicated log, and the data is kept consistent by rolling back or forwarding to the log. However, using logs often generates write amplification, affecting database performance. Write-ahead logging (WAL) serves as an example. A WAL-based transaction update includes three steps: WAL writing, WAL synchronization, and database writing, as shown in Fig. 1a. First, when a transaction is initiated, the new data are written into the page cache of WAL files (Step 1). Upon transaction commit, the WAL files are physically written to flash memory (WAL synchronization) (Step 2). Finally, the database data is updated during system checkpointing. As this checkpoint is performed at the database software level, WAL data cannot be directly moved into the database data. Thus, the WAL file is read again into the page cache (Step 3) and written into flash memory upon database synchronization (Step 4). Duplicated writes introduced by WAL are detrimental to flash memory endurance and performance.

The write overhead incurred by WAL cannot be overlooked compared to directly updating the page. Multiple update operations may be performed on the same data page in the buffer, but during a checkpoint, the storage engine writes the latest data page to a database file. Fig. 2 illustrates the storage engine layer writing process. In the example, two concurrent transactions, Transaction1 and Transaction2, modify the database. Transaction1 updates A and B with values 2 and 4, while Transaction2 updates A and C with values 3 and 7. During the first step of the write merging process, the modifications made by both transactions are recorded in the WAL file. The WAL file maintains separate regions for each transaction, capturing the updated page identifiers and their corresponding values. Consequently, the WAL file contains two distinct entries: one for Transaction1, documenting the updates to pages A(2) and B(4), and another for Transaction2, recording the updates to pages A(3) and C(7). In the second step, the changes recorded in the WAL file are applied to the database during the checkpointing process. As both transactions modify page A, the WAL mechanism merges these updates into a single write operation. The WAL mechanism consolidates the updates and writes the final value of page A(3) to the database file. A contains the merged value of 3, while B and C hold 4 and 7.

Fig. 2. Multi-version pages in the WAL.

2.3. Existing solutions

Existing works propose to exploit data remapping to eliminate duplicate writes in SSDs [8–10]. The key design is not to remove the out-of-place data update but to directly remap the WAL file to the new-version data, as shown in Fig. 1b.

However, address remapping can lead to mapping inconsistency. Flash pages are divided into a data area for storing user data and an OOB area for maintaining metadata. The OOB area contains the physical-to-logical (P2L) mappings, which are crucial for maintaining data consistency during garbage collection and database recovery. During garbage collection, the P2L mappings enable quick identification of the logical address corresponding to a physical address, which accelerates the update of L2P mappings during data migration. During recovery upon a system crash, the FTL can reconstruct the lost L2P mapping table using the P2L mapping stored within the page.

Without remapping, the P2L mappings in the OOB area directly correspond to the LPN in the L2P mapping table. However, mapping inconsistencies may arise after remapping because remapping operations do not simultaneously update the related P2L mappings in the OOB area.

2.4. Preliminary study and motivation

To investigate the performance of database transactions, we conduct preliminary experiments using the FEMU simulator [11], which is discussed in more detail in Section 4.

We run the SQLite database, perform 1 million overwrite operations for each fixed value size, and collect the transaction latency under four value sizes. In Fig. 3, the x-axis represents the transaction value size and the y-axis represents the percentage of the time spent on WAL writes, WAL synchronization, data writes, and data synchronization.

From Fig. 3, we observe that WAL (WAL write and WAL synchronization) takes up a significant portion of the total transaction latency. Compared to the data (data write and data synchronization) operations, the proportion is significantly higher for small value sizes, while for the 16 KB size, the two are comparable.

Fig. 3. Transaction latency distribution in SQLite database.

Two main factors contribute to this phenomenon. Firstly, WAL introduces additional overhead by writing an extra frame header for each transaction. This header contains essential recovery information and is stored alongside the normal data. Consequently, the relative overhead of the frame header becomes more significant for smaller transactions. Secondly, although WAL consolidates multiple updates to the same data pages into a single write operation during checkpointing, the logging mechanism still necessitates storing multiple versions of the same data in log files. This results in increased storage requirements, particularly affecting smaller transactions with frequent updates on the same page, as the overhead of maintaining multiple versions becomes more significant relative to the size of the transactions.

This paper proposes a novel approach by directly updating data and leveraging the inherent multi-version characteristic of flash memory. Shifting the focus of transaction support to flash can reduce the reliance on logs and frequent file synchronization operations in the database. This leads to faster application response times as it reduces the need for excessive logging and synchronization.

3. The proposed NoLgn-FTL

We first introduce the overview of the whole system flow using a no-logging flash translation layer, which, hereafter, we simply refer to as NoLgn-FTL. Then, we delve into the design details of NoLgn-FTL, including old page information storage, the transaction process, garbage collection (GC), and data recovery. Without loss of generality, the SQLite database is used in discussing the use of NoLgn-FTL. Finally, we analyze and discuss the overhead associated with NoLgn-FTL.

3.1. Overview

We propose NoLgn-FTL, a novel approach that optimizes both software and hardware architectures to efficiently manage transactions and data version control at the FTL layer, thereby avoiding the overhead of logs in databases. At the core of NoLgn-FTL is the novel FTL, where transaction information is utilized to perform mapping conversion of logical and physical addresses in the L2P and P2P tables only when data is written, minimizing overhead. However, the use of NoLgn-FTL starts at the database layer, where the transaction information is attached to write requests. The file system layer also plays a crucial role by providing transaction-related interfaces and transmitting necessary transactional metadata.

Fig. 4 shows the overall workflow with an example of a transactional data update on three pages in L1, L2, and L3. The process is divided into three key stages: transaction delivery, transaction persistence, and GC. These stages can be further subdivided into six steps.

First, the database assigns transaction flags to each transaction (① in Fig. 4) to indicate the completion status of the transaction. Then, a transaction ID is added to the original transactional data request (②). To retain transaction flags and IDs, we design new interfaces in the file system (③).
Fig. 4. Overview of NoLgn-FTL.
In the second stage, which occurs within the SSDs, the flash controller identifies transaction data by transaction flags and IDs. Data and transaction information are persisted, obtaining their corresponding physical addresses. The old addresses and transaction information are written in the OOB area of the corresponding flash pages, as well as in the P2P table in DRAM (④). The old pages remain valid in this step but will be invalidated only after the transaction is committed (⑤).

As transactions are continuously executed, a large amount of invalid data accumulates in the flash memory. The GC process (⑥) reclaims the invalid data. The collaboration between the database, file system, and flash controller in NoLgn-FTL ensures data consistency and integrity throughout the transactional data update process.

The modified file system interfaces play a crucial role in preserving the necessary transaction metadata. The design of NoLgn-FTL in the above-mentioned three main stages will be presented in Sections 3.2, 3.3, and 3.4.

3.2. Metadata management in transaction delivery

In the transaction delivery process, we introduce additional metadata to facilitate the implementation of the no-logging scheme. This metadata is passed along with the transactional data requests to ensure proper handling and management of transactions throughout the system.

In the FTL, we establish a physical-to-physical (P2P) table that stores the mapping between new and old physical pages (i.e., their old versions). In detail, one entry in the P2P table includes the transaction ID, the physical page number (PPN) of the new page, and the PPN of the corresponding old page. To ensure persistent P2P mappings, the PPNs of the old pages are also stored in the OOB area of the new flash pages. The primary purposes of the P2P table are twofold: firstly, to facilitate the management of transactional information by the underlying FTL, and secondly, to enhance the performance during GC and transaction operations. Note that locating old pages can be accelerated by using the P2P table, thereby avoiding frequent access on flash pages to the OOB area. This table does not need to be written to flash memory and can be recovered through a full scan even after a sudden power failure, thus avoiding frequent writes of transaction information to flash memory.

Furthermore, transaction information, including transaction IDs and flags, is stored in the OOB area of new flash pages. In detail, flags S, M, and E represent the starting page, the middle pages, and the end page of a transaction, respectively. In the implementation of transaction flags, since we are only concerned with whether the transaction has ended, we use only one bit to mark the transaction's completion. By storing transaction information alongside the corresponding pages, the progress and state of transactions can be more effectively tracked, enabling data recovery in case of unexpected failures or interruptions. Database recovery will be explained in Section 3.5.

In addition to transaction information, one extra bit, referred to as the lock bit, is used to indicate the block lock state. The lock bit value 1 signifies that valid old pages exist in the current block, while 0 indicates the block is stale and can be reclaimed during GC. By embedding the lock bit within the FTL, blocks containing valid old pages and normal blocks can be efficiently distinguished, allowing for GC optimization. The GC process under NoLgn-FTL will be presented in Section 3.4.
3.3. Transaction persistence in NoLgn-FTL P2P Table Storage and Overhead: The P2P table is stored in the RAM
of the flash controller. The number of entries in the P2P table depends
To ensure transaction persistence, the transaction needs to do the on the number of concurrent transactions. In our experiment, the table
following during its write and commit process. During transaction contains 10 000 entries. Each P2P entry takes 12 bytes, including a 4-
writing, NoLgn-FTL first looks up the original L2P table to find the byte transaction ID and 4 bytes each for the new page PPN and the
old PPN corresponding to the requested logical addresses. As shown in old page PPN. The total size of the P2P table is about 120 KB. The
1
Fig. 4, the old PPNs are P1, P2, and P3 for the requested L1, L2, and L3, DRAM size is usually around 1024 of the SSD capacity. For an SSD with
respectively. Then, the updated data are written into the new pages P1 , a 1TB capacity, the DRAM size will be 1 GB, and the P2P table will be
P2 , and P3 , respectively. At the same time, transaction information 0.12 MB, which is only 0.012% of the DRAM size and is negligible. The
and the old PPN are written into the OOB area of these new pages. block lock state is stored in the metadata of data blocks as a bitmap,
Finally, NoLgn-FTL stores the mapping entry of P1, P2, and P3 into the with each block requiring only 1 bit, which is insignificant in terms of
P2P table. Different from the original flash write, the old page remains overhead. This lock bit is loaded into the SSDs DRAM during startup.
valid. Meanwhile, the blocks lock state containing valid old pages is Transaction Information Storage in OOB Area: Transaction informa-
set to 1. tion is stored in the OOB area of flash pages. NoLgn-FTL uses 4 bytes for
During transaction commit, NoLgn-FTL first searches the P2P table old PPNs and 4 bytes for transaction information (comprising the trans-
to find old valid pages and then invalidates them. Then, the blocks lock action ID and 1 bit for transaction flag). In current flash chips, the ratio
state containing these old valid pages would be set to 0. Finally, the of the OOB area size to the data area size is about 18 [12]. Therefore,
corresponding entries in the P2P table are deleted. the OOB area has enough space to store transaction information.
3.4. Garbage collection with NoLgn-FTL 4. Evaluation
GC in NoLgn-FTL requires handling valid old pages temporarily In this section, we present a comprehensive evaluation of NoLgn-
generated during transaction processing. Selecting a victim block for FTL, using an SQLite and Ext4 combination as a case study. We first
describe the experimental setup. Then, we present the sqlite-bench
GC involves several steps to ensure data integrity and efficient space
experimental results, focusing on two key aspects: flash write and
reclamation.
database performance. We also investigate the impact of NoLgn-FTL
When selecting a victim block for GC, the first step is to check the
on GC. Furthermore, we show the performance of real-world workloads
blocks lock state. If the lock state is 1, valid old pages still exist within
with the YCSB and TPC-C benchmarks.
the block, and therefore, the block cannot be reclaimed. At this time,
the next victim block in the queue is selected until the selected blocks
4.1. Experimental setup
lock state is 0. Then, whether there is a transaction page in the block
must be checked. As the transaction information and old PPN are stored
NoLgn-FTL is implemented on FEMU [1315], a QEMU-based NVMe
in the OOB area of the new valid pages, GC in NoLgn-FTL deals with
SSD emulator. The host system kernel of FEMU is Linux 5.15, and the
them differently depending on the transaction state. That is, before the
file system is Ext4. To ensure a representative and consistent setup, the
transaction is committed, GC will migrate these valid pages together
simulated SSD has a 16 GB logical capacity, with 1024 pages per flash
with the OOB area. However, after a commit has occurred, GC only
block and a 4 KB page size. The flash latency for read, write, and erase
migrates valid page data, removing the extra metadata of NoLgn-FTL
operations is 50 μs, 500 μs, and 5 ms, respectively [16]. To ensure the
that resides in the OOB area.
GC (Garbage Collection) mechanism is appropriately triggered during
our experiments, we conducted 4 million 4 KB write operations on the
3.5. Database recovery with NoLgn-FTL
SSD in each test. This setup guarantees that GC operations occur as part
of the evaluation.
In the event of a power-off or system crash, data stored in the For the logging database, we make use of SQLite. We make nec-
flash controllers RAM is lost, and only the OOB area of flash pages essary modifications to the Linux kernel to receive and process trans-
can be used for system recovery. One solution is to recover to the action information from the SQLite database. To enable SQLite to
consistent states in the latest checkpoint, which requires periodically transmit transaction information to the kernel, we utilize the ioctl
storing checkpoints. The other solution involves a full flash scan to system call to change database write, commit, and abort operations into
rebuild mappings, as shown in Step 1 of Fig. 5. Physical pages and write, commit, and abort commands. As SQLite does not automatically
their OOB area would be read one by one (Step 2). For pages that generate unique transaction IDs for each transaction, the transaction
do not have transaction information in the OOB area, NoLgn-FTL can IDs are generated in the kernel after each transaction is committed.
directly recover the L2P table of PPNs based on the LPNs in their OOB Upon receiving the written information from SQLite, the kernel first
area. Otherwise, NoLgn-FTL decides to recover old-version pages or assigns flags to the requested transaction pages. This enables the kernel
not according to transaction information. NoLgn-FTL would first obtain to keep track of the transaction status and perform necessary operations
pages with the same transaction ID. If the page with the end flag bit accordingly. Approximately 150 lines of code were modified in SQLite,
can be found, these pages would be directly put into the L2P table around 100 lines in the file system, and about 300 lines in FEMU.
together with their LPNs (Step 3). Otherwise, if all pages have the flag Hereafter, NoLgn-FTL will refer to the entire SQLite-Ext4-SSD sys-
bit 0, which indicates that the current transaction is not committed, tem stack modified to ensure the seamless integration and functionality
the old-version pages would be first read out (Step 4), and only the L2P of NoLgn-FTL within the existing software and hardware stack. The
mappings of old-version pages would then be put into the L2P table. newly introduced commands, which are based on the ioctl system
call, are as follows.
3.6. Discussion and overhead analysis write(page p, tid t, flag f). This command adds a transaction ID (tid),
𝑡, and a transaction flag, 𝑓 , to the original write operation. It is the
Compared to existing logging methods that store extra logs for each beginning of a transaction and corresponds to Step 4 in Fig. 4. The
transaction, the use of NoLgn-FTL allows normal data updates without inclusion of the transaction ID and flag enables the FTL to track and
the need for additional logging. The overhead of NoLgn-FTL is due manage the transaction.
to the storage of extra metadata, including the P2P table, transaction commit (tid t). This command with the parameter of transaction ID
information, and the block lock state. tid t is sent to NoLgn-FTL along with the original fsync command in the
5
Z. Yin et al. Journal of Systems Architecture 160 (2025) 103347
Fig. 5. Recovery with NoLgn-FTL.
Linux kernel. It indicates the successful completion of a transaction and shows the normalized number of writes in flash memory compared to
aligns with Step 5 in Fig. 4. Upon receiving this command, NoLgn-FTL Base-WAL under two synchronization modes. In NORMAL mode, SW-
finalizes the transaction and ensures the durability of the associated WAL reduces writes by 35% compared to Base-WAL, as it eliminates
data. extra writes caused by out-of-place updates through WAL file remap-
abort(tid t). This command is invoked to terminate ongoing trans- ping. On average, NoLgn-FTL reduces 55% and 20% of the flash page
actions before committing transaction 𝑡. It indicates a rollback opera- writes compared to Base-WAL and SW-WAL, respectively. The superior
tion, reverting the data pages to their previous versions, akin to the performance of NoLgn-FTL is due to its elimination of WAL writes
data recovery process for uncommitted transactions as mentioned in and WAL synchronization, resulting in a greater reduction of writes
Section 3.5. compared to SW-WAL. Specifically, there are two reasons for NoLgn-
We compare NoLgn-FTL with Base-WAL, the original SQLite, which FTLs write reduction. First, as WAL has to write an extra log header,
uses the native logging scheme, and SW-WAL [4], which reduces WAL write involves more data than normal data write. Second, since
duplicate writes by SSD remapping as shown in Fig. 1a. For each trans-
synchronization does not happen immediately after each transaction,
action size, the database runs separately, but these transactions share
in NORMAL mode, updates onto the same page are serviced from
the same SSD storage. It is important to consider that in real-world
the cache. NoLgn-FTL combines several updates into a single update,
scenarios, particularly in mobile environments, the characteristics of
thereby reducing writes. However, this combination cannot be realized
write requests can significantly impact the performance of storage
in SW-WAL as it uses different LPNs for data updates and WAL writes.
systems. SQLite is a lightweight, embedded database commonly used in
mobile devices for local data storage, making it highly relevant to our In FULL mode, NoLgn-FTL reduces flash page writes by 35% and
analysis. Studies have shown that approximately 90% of write requests 2% compared to Base-WAL and SW-WAL, respectively. Both methods
in Android applications, such as Facebook and Twitter, are related to show reductions in page writes compared with Base-WAL, similar to the
SQLite databases and journal files. In environments like these, the data items stored in the database are typically small, often below 4 KB. These small data items, such as individual records or key-value pairs, are frequently written to the storage medium in the form of random write operations. These operations usually target data blocks ranging from 64B to 4 KB, and such small writes often involve frequent interaction with the underlying file system, such as EXT4, which is commonly used in Android devices [17,18]. Therefore, we set different transaction sizes from 256B to 16 KB in the experiment to observe their impact on performance.

We conduct experiments in both the FULL and NORMAL synchronous modes of the database. In FULL mode, synchronization is triggered after each transaction is committed. This forces all transaction data to be written into SSDs, thus providing the highest atomicity and durability. Conversely, in NORMAL mode, synchronization is not triggered immediately after the transaction is committed. Typically, transactions are synchronized into SSDs only when a certain number of frames (including transaction heads and data) have accumulated. Note that NoLgn-FTL has no explicit WAL synchronization operation. In NORMAL mode, we manually control the commit frequency in NoLgn-FTL to keep it consistent with the synchronization operations of the other two existing methods. In NoLgn-FTL, a synchronization operation is triggered every 1000 data pages.

4.2. Results of flash page writes

We used sqlite-bench with 200 thousand overwrite operations to observe the effect of NoLgn-FTL on flash memory page writes. Fig. 6 shows the results in both NORMAL and FULL modes. However, in FULL mode, the enhancement brought by NoLgn-FTL is less than that in NORMAL mode. As each transaction is forcibly synchronized to flash memory after committing, there is no chance for NoLgn-FTL to combine updates on the same page, and the reduction from log header writes is limited. Thus, in this mode, NoLgn-FTL behaves similarly to SW-WAL.

4.3. Results of database performance

We used sqlite-bench to observe SQLite performance. Fig. 7 shows the normalized throughput results of SQLite under the three compared methods. In NORMAL mode, NoLgn-FTL achieves an average performance improvement of 51% and 15% against Base-WAL and SW-WAL, respectively. NoLgn-FTL performs particularly well compared to SW-WAL for small-sized transactions, due to the reasons described earlier.

In FULL mode, we observe that NoLgn-FTL outperforms Base-WAL and SW-WAL by an average of 26% and 4%, respectively. This performance improvement is primarily due to the reduction in the number of writes achieved by NoLgn-FTL. Meanwhile, we find that both SW-WAL and NoLgn-FTL demonstrate a gradual performance improvement as the transaction size increases. This is because, for large-size transactions, Base-WAL incurs more latency from flash page writes and GC. Since SW-WAL and NoLgn-FTL reduce the number of data writes, this degradation is mitigated. Even in this situation, the performance of SW-WAL is still inferior to that of NoLgn-FTL, as it maintains head information that consumes data write latency.
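The FULL and NORMAL modes compared above are standard SQLite settings selected through PRAGMA statements; a minimal sketch of how a benchmark might configure them (the database file name here is illustrative, not the actual sqlite-bench setup):

```python
import os
import sqlite3
import tempfile

# Illustrative configuration only; the file name is not from sqlite-bench.
path = os.path.join(tempfile.mkdtemp(), "bench.db")
conn = sqlite3.connect(path)

# Enable write-ahead logging, as in the Base-WAL baseline.
journal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# FULL mode: fsync on every transaction commit (highest durability).
# NORMAL mode would instead sync only at WAL checkpoints:
#   conn.execute("PRAGMA synchronous=NORMAL")
conn.execute("PRAGMA synchronous=FULL")
sync_level = conn.execute("PRAGMA synchronous").fetchone()[0]

print(journal_mode, sync_level)  # SQLite encodes FULL as 2
```

In NORMAL mode, the benchmark additionally has to control commit frequency by hand, which is how the per-1000-data-page synchronization described above is kept consistent across the three compared methods.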
Z. Yin et al. Journal of Systems Architecture 160 (2025) 103347
Fig. 6. Results of flash page writes.
Fig. 7. SQLite database performance.
Fig. 8. SQLite database latency.
Besides, we also evaluated database latency under different conditions. Fig. 8 illustrates the normalized latency results under the three compared methods, Base-WAL, SW-WAL, and NoLgn-FTL, in both NORMAL and FULL modes.

In NORMAL mode, NoLgn-FTL demonstrates the lowest latency among the three methods, achieving an average reduction of 34.4% compared to Base-WAL and 11% compared to SW-WAL. The latency advantage of NoLgn-FTL is particularly pronounced for small-sized transactions (e.g., 256B and 512B). This stems from its ability to reduce the number of writes and optimize metadata updates, minimizing the overhead typically associated with WAL. SW-WAL also shows improved latency compared to Base-WAL, with an average reduction of approximately 26.2%, thanks to its selective write strategy. However, its performance is still limited by the additional overhead of writing the WAL, which becomes increasingly noticeable for smaller transactions. In FULL mode, the latency reduction achieved by NoLgn-FTL remains significant. Compared to Base-WAL, NoLgn-FTL reduces latency by an average of 16.4%, and compared to SW-WAL, the reduction is 3.7%. Both NoLgn-FTL and SW-WAL exhibit a gradual latency improvement as transaction size increases, which aligns with the behavior observed in the throughput analysis. For larger transactions (e.g., 8 KB and 16 KB), Base-WAL experiences higher latency due to more extensive flash page writes and garbage collection overhead. In contrast, NoLgn-FTL and SW-WAL effectively mitigate this degradation by reducing the volume of writes.

4.4. Results of GC overhead

We used sqlite-bench to investigate the impact of block locking on GC performance by collecting write distribution results under different transaction sizes. Fig. 9 shows the write distribution of host requests, GC migration, and block locking (denoted as additional pages) under different transaction sizes.
Fig. 9. Results of GC overhead. NoLgn-FTL would lock certain blocks, which would affect victim block selection and induce more migrations.
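To see why locking blocks that hold uncommitted transaction data can inflate GC migration, consider a toy greedy victim selector (a hypothetical model for illustration, not the paper's actual GC policy): when the block with the fewest valid pages is locked, GC must settle for a worse victim and copy more valid pages.

```python
def pick_victim(valid_counts, locked=frozenset()):
    """Greedy GC: choose the erasable block with the fewest valid pages.

    valid_counts: dict mapping block id -> number of valid pages.
    locked: block ids pinned by in-flight transactions (ineligible).
    Returns the chosen block id, or None if every block is locked.
    """
    candidates = {b: v for b, v in valid_counts.items() if b not in locked}
    if not candidates:
        return None  # GC must wait until a transaction commits or aborts
    return min(candidates, key=candidates.get)

blocks = {0: 2, 1: 30, 2: 17}            # valid pages per block
best = pick_victim(blocks)                # block 0: only 2 pages to migrate
worse = pick_victim(blocks, locked={0})   # block 2: 17 pages to migrate
extra = blocks[worse] - blocks[best]      # 15 additional valid-page copies
print(best, worse, extra)
```

In the measurements below, this detour stays small for NoLgn-FTL: at most 6% additional migrated pages, and a 3.5% average increase in total write pages.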
Two key observations can be made from Fig. 9. First, as the transaction value size increases, the proportion of valid page migration involved in GC also increases, reaching a maximum of 62%. This trend can be attributed to the fact that larger transaction sizes require more frequent GC to accommodate new content. Second, the block locking mechanism impacts the number of valid pages migrated. The maximum proportion of additional migration pages due to block locking is 6%, with an average increase of 3.5% in total write pages. This impact is more significant for smaller transaction sizes, as updates may be concentrated in fewer blocks, preventing them from being chosen as optimal victim blocks for GC and leading to suboptimal data migration with more valid pages.

Despite the extra page writes caused by block locking, these overheads are acceptable compared to the significant reduction in duplicate writes achieved by NoLgn-FTL. The benefits of eliminating duplicate writes and improving overall write performance outweigh the relatively minor increase in valid page migrations caused by locking SSD blocks.

4.5. Results of YCSB and TPC-C performance

We also evaluate NoLgn-FTL using the YCSB benchmark to assess its performance under various realistic workloads. YCSB provides six core workloads, summarized in Table 1. To evaluate the long-term impact of NoLgn-FTL, we use TPC-C benchmarks with four warehouses [19] tested under different SSD free space conditions. TPC-C contains the following five transaction types: 43% new order, 43% payment, 4% delivery, 4% order status, and 4% stock level. The number of database connections was set to 1 to avoid frequent aborts of update transactions.

Table 1
YCSB workloads.

Workload  Description
A         50% read and 50% update, Zipfian distribution
B         95% read and 5% update, Zipfian distribution
C         100% read, Zipfian distribution
D         95% read and 5% insert, latest read
E         95% scan and 5% insert, Zipfian distribution
F         50% read and 50% read-modify-write, Zipfian distribution

Fig. 10 shows the normalized throughput results of SQLite under YCSB benchmarks in NORMAL mode. On average, SW-WAL shows a 10% performance improvement over Base-WAL, while NoLgn-FTL achieves a 17% improvement. For write-intensive workloads (A and F), both SW-WAL and NoLgn-FTL exhibit significantly better performance than Base-WAL. However, for read-intensive workloads (B, D, and E), the improvements from both methods are not significant. This is mainly because both methods only enhance write performance and have little impact on read performance. Meanwhile, NoLgn-FTL still outperforms SW-WAL due to its greater write performance benefits. In the case of workload C, which contains only read requests, there are no obvious differences among the three methods. This is because the remap-based logging in SW-WAL and the no-logging scheme in NoLgn-FTL are not triggered. The slight performance fluctuations arise from the random nature of read operations.

Fig. 11 shows the performance of SQLite in terms of transactions per minute (tpmC) with different SSD free spaces. To obtain SSDs with varying free space, sufficient random overwrite iterations are performed before each of the experiments. TPC-C is a write-intensive workload with operations such as new orders, payment, and delivery, with an average of two pages updated per transaction. The results show that when SSD free space is 75%, the performance differences among the three methods are relatively small. However, as SSD free space decreases, the performance gap widens. Overall, NoLgn-FTL significantly outperforms Base-WAL and SW-WAL. On average, SW-WAL improves transaction throughput by 20% compared to Base-WAL, while NoLgn-FTL improves throughput by 38%. Notably, the performance gains of SW-WAL and NoLgn-FTL become more pronounced when SSD free space is limited. When SSD remaining space is 25%, NoLgn-FTL's throughput is 81% higher than Base-WAL's. This is mainly because when SSD free space is low, there may be a lack of free blocks, requiring frequent GC to accommodate new writes. Additionally, TPC-C's transaction data size is relatively small, allowing multiple data items to be stored in a single page. Therefore, NoLgn-FTL effectively reduces write operations and GC needs by minimizing duplicated writes.

5. Related works

Research addressing duplicate writes can be divided into two directions: optimization of atomic writes and remapping-based methods. An atomic write interface was initially proposed by Park et al. [20], which achieved atomicity for multi-page writes. Prabhakaran et al. [21] further introduced a transactional FTL called txFlash, which provides a transaction interface (WriteAtomic) to higher-level software. It provides isolation among multiple atomic write calls by ensuring that no conflicting writes are issued. Xu et al. [22] used the native off-site update feature of NAND flash memory to simulate copy-on-write technology and, at the same time, used NVM to store the FTL mapping table. However, these methods mostly supported atomicity for multi-page writes only. Kang et al. presented X-FTL [23], aiming to support general transactional atomicity, allowing data pages in a transaction
Fig. 10. SQLite performance on YCSB benchmarks.
Fig. 11. SQLite performance on TPC-C benchmark.
to be written to flash at any time. However, it requires an additional X-L2P table and needs to persist it to flash upon transaction commit.

Address remapping is another extensively researched method that modifies the mapping table directly without performing actual writes. Wu et al. [24] proposed KVSSD, which exploits the FTL mapping mechanism to implement copy-free compaction of LSM trees and enables direct data allocation in flash memory for efficient garbage collection. However, address remapping may suffer from mapping inconsistencies due to the inability of flash memory to perform in-place updates. Hahn et al. [25] use the address remapping operation for file system defragmentation. However, after remapping, it relies on file system logs to deal with mapping inconsistencies. The larger log size results in longer search times and increased memory consumption when performing read operations. As the number of remappings escalates, the log can grow to several hundred MB or even GB. Therefore, these methods may incur significant lookup overhead. Zhou et al. [26] address this issue by storing the new mapping table in non-volatile memory, reducing lookup overhead. Besides, Wu et al. [4] proposed SW-WAL, a novel approach that emulates the maintenance of a mapping table by inscribing transaction information directly into the OOB area of flash pages. This strategy markedly reduces the footprint of the search table and concurrently boosts search efficiency. Additionally, to deal with the heavy query latency during WAL checkpointing, Yoon et al. [27] proposed Check-In to align journal logs to the FTL mapping unit. The FTL creates a checkpoint by remapping the journal logs to the checkpoint, effectively reducing the checkpointing overhead and the WAL's duplicate writes.

6. Conclusion

In this paper, we presented NoLgn-FTL to directly update the database in a no-logging way by reusing old flash pages. NoLgn-FTL uses a P2P table and the OOB area of flash pages to keep old page information and transaction information. Thus, systems can recover to a consistent state when a crash happens. As there is no need to store logging files in NoLgn-FTL, duplicate writes can be avoided. We implemented a prototype of NoLgn-FTL on the FEMU SSD simulator and integrated it with the SQLite database. The file system is modified to enable SQLite to use the provided interface and transfer transaction information. Experimental results demonstrate that NoLgn-FTL can significantly reduce writes to SSDs and improve the performance of SQLite, while still ensuring atomicity.

CRediT authorship contribution statement

Zhenghao Yin: Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Data curation. Yajuan Du: Writing – review & editing, Supervision, Project administration, Conceptualization. Yi Fan: Visualization. Sam H. Noh: Writing – review & editing.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

References

[1] C. Mohan, D. Haderle, B. Lindsay, H. Pirahesh, P. Schwarz, ARIES: A transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging, ACM Trans. Database Syst. 17 (1) (1992) 94–162.
[2] S. Lee, D. Park, T. Chung, D. Lee, S. Park, H. Song, A log buffer-based flash translation layer using fully-associative sector translation, ACM Trans. Embed. Comput. Syst. (TECS) 6 (3) (2007) 18–es.
[3] L. Shi, J. Li, C.J. Xue, C. Yang, X. Zhou, ExLRU: A unified write buffer cache management for flash memory, in: Proceedings of the Ninth ACM International Conference on Embedded Software, 2011, pp. 339–348.
[4] Q. Wu, Y. Zhou, F. Wu, K. Wang, H. Lv, J. Wan, C. Xie, SW-WAL: Leveraging address remapping of SSDs to achieve single-write write-ahead logging, in: 2021 Design, Automation & Test in Europe Conference & Exhibition, DATE, 2021, pp. 802–807.
[5] F. Ni, X. Wu, W. Li, L. Wang, S. Jiang, Leveraging SSD's flexible address mapping to accelerate data copy operations, in: 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), 2019, pp. 1051–1059.
[6] J. Coburn, T. Bunker, M. Schwarz, R. Gupta, S. Swanson, From ARIES to MARS: Transaction support for next-generation, solid-state drives, in: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, 2013, pp. 197–212.
[7] J. Arulraj, M. Perron, A. Pavlo, Write-behind logging, Proc. VLDB Endow. 10 (4) (2016) 337–348.
[8] K. Han, H. Kim, D. Shin, WAL-SSD: Address remapping-based write-ahead-logging solid-state disks, IEEE Trans. Comput. 69 (2) (2019) 260–273.
[9] G. Oh, C. Seo, R. Mayuram, Y.-S. Kee, S.-W. Lee, SHARE interface in flash storage for relational and NoSQL databases, in: Proceedings of the 2016 International Conference on Management of Data, 2016, pp. 343–354.
[10] Q. Wu, Y. Zhou, F. Wu, H. Jiang, J. Zhou, C. Xie, Understanding and exploiting the full potential of SSD address remapping, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 41 (11) (2022) 5112–5125.
[11] H. Li, M. Hao, M.H. Tong, S. Sundararaman, M. Bjørling, H.S. Gunawi, The CASE of FEMU: Cheap, accurate, scalable and extensible flash emulator, in: 16th USENIX Conference on File and Storage Technologies (FAST 18), 2018, pp. 83–90.
[12] Y. Zhou, F. Wu, Z. Lu, X. He, P. Huang, C. Xie, SCORE: A novel scheme to efficiently cache overlong ECCs in NAND flash memory, ACM Trans. Archit. Code Optim. (TACO) 15 (4) (2018) 1–25.
[13] L. Long, S. He, J. Shen, R. Liu, Z. Tan, C. Gao, D. Liu, K. Zhong, Y. Jiang, WA-Zone: Wear-aware zone management optimization for LSM-Tree on ZNS SSDs, ACM Trans. Archit. Code Optim. 21 (1) (2024) 1–23.
[14] D. Huang, D. Feng, Q. Liu, B. Ding, W. Zhao, X. Wei, W. Tong, SplitZNS: Towards an efficient LSM-tree on zoned namespace SSDs, ACM Trans. Archit. Code Optim. 20 (3) (2023) 1–26.
[15] S.-H. Kim, J. Shim, E. Lee, S. Jeong, I. Kang, J.-S. Kim, NVMeVirt: A versatile software-defined virtual NVMe device, in: 21st USENIX Conference on File and Storage Technologies (FAST 23), 2023, pp. 379–394.
[16] B.S. Kim, J. Choi, S.L. Min, Design tradeoffs for SSD reliability, in: 17th USENIX Conference on File and Storage Technologies (FAST 19), 2019, pp. 281–294.
[17] Z. Shen, Y. Shi, Z. Shao, Y. Guan, An efficient LSM-tree-based SQLite-like database engine for mobile devices, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 38 (9) (2018) 1635–1647.
[18] A. Mäkinen, Tracing Android applications for file system optimization.
[19] S.T. Leutenegger, D. Dias, A modeling study of the TPC-C benchmark, ACM SIGMOD Rec. 22 (2) (1993) 22–31.
[20] S. Park, J.H. Yu, S.Y. Ohm, Atomic write FTL for robust flash file system, in: Proceedings of the Ninth International Symposium on Consumer Electronics (ISCE 2005), 2005, pp. 155–160.
[21] V. Prabhakaran, T.L. Rodeheffer, L. Zhou, Transactional flash, in: OSDI, Vol. 8, 2008.
[22] Y. Xu, Z. Hou, NVM-assisted non-redundant logging for Android systems, in: 2016 IEEE Trustcom/BigDataSE/ISPA, 2016, pp. 1427–1433.
[23] W.-H. Kang, S.-W. Lee, B. Moon, G.-H. Oh, C. Min, X-FTL: Transactional FTL for SQLite databases, in: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, 2013, pp. 97–108.
[24] S.-M. Wu, K.-H. Lin, L.-P. Chang, KVSSD: Close integration of LSM trees and flash translation layer for write-efficient KV store, in: 2018 Design, Automation & Test in Europe Conference & Exhibition, DATE, IEEE, 2018, pp. 563–568.
[25] S.S. Hahn, S. Lee, C. Ji, L. Chang, I. Yee, L. Shi, C.J. Xue, J. Kim, Improving file system performance of mobile storage systems using a decoupled defragmenter, in: 2017 USENIX Annual Technical Conference (USENIX ATC 17), 2017, pp. 759–771.
[26] Y. Zhou, Q. Wu, F. Wu, H. Jiang, J. Zhou, C. Xie, Remap-SSD: Safely and efficiently exploiting SSD address remapping to eliminate duplicate writes, in: 19th USENIX Conference on File and Storage Technologies (FAST 21), 2021, pp. 187–202.
[27] J. Yoon, W.S. Jeong, W.W. Ro, Check-In: In-storage checkpointing for key-value store system leveraging flash-based SSDs, in: 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture, ISCA, 2020, pp. 693–706, http://dx.doi.org/10.1109/ISCA45697.2020.00063.

Zhenghao Yin received the BS degree in Computer Science from Wuhan University of Technology, Wuhan, China, in 2022, and is currently pursuing the MS degree in Computer Science, expected to graduate in 2025. His research interests include flash memory and database technologies.

Yajuan Du received the joint Ph.D. degrees from the City University of Hong Kong and the Huazhong University of Science and Technology, in December 2017 and February 2018, respectively. She is currently an Assistant Professor with the School of Computer Science and Technology, Wuhan University of Technology. Her research interests include optimizing access performance, data reliability, and persistency of flash memories and non-volatile memories.

Yi Fan received the BS degree in Computer Science from Wuhan University of Technology, Wuhan, China, in 2022, and is currently pursuing the MS degree in Computer Science, expected to graduate in 2025. His research interests include key-value databases and flash memory technologies.

Sam H. (Hyuk) Noh received his BE in Computer Engineering from Seoul National University in 1986 and his Ph.D. in Computer Science from the University of Maryland in 1993. He held a visiting faculty position at George Washington University (1993–1994) before joining Hongik University, where he was a professor in the School of Computer and Information Engineering until 2015. From 2001 to 2002, he was a visiting associate professor at UMIACS, University of Maryland. In 2015, Dr. Noh joined UNIST as a professor in the Department of Computer Science and Engineering. He became the inaugural Dean of the Graduate School of Artificial Intelligence and previously served as Dean of the School of Electrical and Computer Engineering (2016–2018). He has contributed to numerous conferences, serving as General Chair, Program Chair, or committee member for events like ACM SOSP, USENIX FAST, ACM ASPLOS, and USENIX OSDI. He also chaired the ACM HotStorage Steering Committee and serves on the Steering Committees for USENIX FAST and IEEE NVMSA. Dr. Noh was Editor-in-Chief of ACM Transactions on Storage (2016–2022) and is now co-Editor-in-Chief of ACM Transactions on Computer Systems. His research focuses on system software and storage systems, emphasizing emerging memory technologies like flash and persistent memory.