Persistent Data Structure

58 papers

About this topic

A persistent data structure is a data structure that maintains its previous versions when modified, allowing access to both the current and historical states. This characteristic enables efficient version control and facilitates functional programming paradigms, where immutability and state preservation are essential.
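
As a rough illustration of the definition above, the following Python sketch shows a persistent singly linked stack in which every update returns a new version and old versions stay readable. The names `Node`, `push`, and `to_list` are hypothetical and chosen only for this example; it is a minimal sketch of structural sharing, not a reference implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """One immutable cell of the stack (hypothetical name)."""
    value: int
    rest: Optional["Node"] = None


def push(stack: Optional[Node], value: int) -> Node:
    # A new version is a new head cell that points at the old version.
    # The old version is never modified, so it remains valid.
    return Node(value, stack)


def to_list(stack: Optional[Node]) -> list:
    # Walk from the head to the end and collect the values.
    out = []
    while stack is not None:
        out.append(stack.value)
        stack = stack.rest
    return out


v0 = None          # empty version
v1 = push(v0, 1)   # version holding [1]
v2 = push(v1, 2)   # version holding [2, 1]
v3 = push(v1, 3)   # a second version derived from v1, holding [3, 1]
```

Both `v2` and `v3` share the cell of `v1` instead of copying it, which is why keeping historical states can be cheap when the data is immutable.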

Key research themes

1. How can persistent memory technology improve the design and performance of persistent data structures?

This theme investigates leveraging byte-addressable persistent memory (PM) technologies—such as phase change memory (PCM) and Intel Optane DC Persistent Memory—to redesign data structures and storage systems. It explores novel mechanisms that harness PM's low latency, persistence, and byte-addressability to achieve atomicity, strong consistency, durability, and scalability in persistent data structures, file systems, hash tables, and synchronization protocols. Research evaluates new algorithms and hardware-software co-designs that minimize overhead from persistence operations and improve recovery after crashes.

Better I/O Through Byte-Addressable, Persistent Memory

by Dawie Burger

2022

Key finding: Introduces BPFS, a file system optimized for byte-addressable persistent memory, using short-circuit shadow paging and atomic 8-byte writes to achieve fine-grained atomic updates. Evaluation on DRAM and simulated PCM shows...
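
The general idea behind shadow paging can be sketched in Python as follows. This is plain copy-on-write on a small binary tree, not the short-circuit variant the paper describes, and the names `Block`, `cow_update`, and `Volume` are hypothetical. A Python reference assignment stands in for the atomic 8-byte pointer write; that substitution is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Block:
    """Immutable tree block (hypothetical stand-in for a storage block)."""
    key: int
    data: str
    left: Optional["Block"] = None
    right: Optional["Block"] = None


def cow_update(node: Optional[Block], key: int, data: str) -> Block:
    # Copy only the blocks on the path from the root to the changed key.
    # Every subtree off that path is shared with the previous tree.
    if node is None:
        return Block(key, data)
    if key == node.key:
        return Block(key, data, node.left, node.right)
    if key < node.key:
        return Block(node.key, node.data, cow_update(node.left, key, data), node.right)
    return Block(node.key, node.data, node.left, cow_update(node.right, key, data))


class Volume:
    """Holds the single root reference that defines the committed state."""

    def __init__(self) -> None:
        self.root: Optional[Block] = None

    def write(self, key: int, data: str) -> None:
        shadow = cow_update(self.root, key, data)  # build the shadow copy first
        self.root = shadow                         # commit with one reference update


vol = Volume()
vol.write(5, "a")
vol.write(2, "b")
vol.write(8, "c")
before = vol.root      # snapshot of the committed tree
vol.write(2, "B")      # the update copies the path to key 2 only
```

Until the root reference is replaced, readers of the old root see a complete and consistent tree, which is the property that makes the single-pointer commit attractive.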

ESH: Design and Implementation of an Optimal Hashing Scheme for Persistent Memory

by Junseok Hwang

2025, Applied Sciences

Key finding: Proposes ESH, a scalable hashing scheme tailored for persistent memory that improves load factor, memory utilization, and scalability by redistributing overflow records within hash table segments to delay costly full-table...

The performance power of software combining in persistence

by Panagiota Fatourou

2025, Zenodo (CERN European Organization for Nuclear Research)

Key finding: Presents novel recoverable software combining protocols (PBcomb and PWFcomb) that reduce persistence overhead by minimizing expensive persistence instructions and contention. These protocols achieve strong recoverability and...

Software Hint-Driven Data Management for Hybrid Memory in Mobile Systems

by Paul Gratz

2024, ACM Transactions on Embedded Computing Systems

Key finding: Develops a hardware-assisted memory management unit (HMMU) combined with software-provided hints to optimize data placement in hybrid memory systems combining DRAM and emerging NVMs like PCM. The approach overcomes hardware...

Persistent objects in the Fleet system

by Daniela Tulone

2024, Proceedings DARPA Information Survivability Conference and Exposition II. DISCEX'01

Key finding: Describes Fleet, a middleware system implementing persistent Java objects replicated over distributed servers with Byzantine fault tolerance. Fleet provides linearizable concurrent semantics, liveness guarantees under benign...

2. What methods enable efficient incremental maintenance and querying of graph-structured persistent data?

This theme addresses the challenges of managing persistent graph-structured data, including how to define materialized and virtual views, maintain these views incrementally after base data modifications, and extract meaningful graph-theoretical features for persistent analysis. Research explores generalizations of views and persistence beyond relational models, rank-based and indexing-aware persistence functions for graphs and directed graphs, facilitating efficient updates and queries on complex, linked data structures common in modern applications like social networks and Web data.

Exploring Graph and Digraph Persistence

by Mattia G Bergomi

2024

Key finding: Introduces rank-function-based and indexing-aware persistence functions tailored for graphs and directed graphs, avoiding reliance on traditional simplicial homology constructions. Defines 'simple' and 'single-vertex'...

3. How can persistent object-oriented models effectively manage large volumes of structured data with efficient storage and querying, especially when using programming languages like C++?

This theme investigates the design and implementation of persistent object-oriented data models that divide object representation between memory and disk to handle large, interconnected datasets. It examines approaches to partition objects into transient identifiers in RAM and bulk data stored separately, enabling efficient queries without reading entire objects. Research focuses on native query capabilities embedded in languages like C++, navigational data access, and file naming schemes to optimize storage utilization, query speed, and manageable memory footprints for complex scientific and engineering applications.

On data storage and searching in persistent object-oriented models in C++

by Alexander Kozynchenko

2024, Research Square (Research Square)

Key finding: Proposes partitioning persistent objects into transient parts with object identifiers in RAM and bulk data stored on disk in files named uniquely by object ID, enabling rapid initial searches via file names and deferred deep...

All papers in Persistent Data Structure

An aspect-oriented implementation of the EJB3.0 persistence concept

by André Martins

2026, Proceedings of the 6th workshop on Aspects, components, and patterns for infrastructure software

This paper demonstrates the power of aspect-orientation by implementing the EJB3.0 persistence framework. Our approach has advantages over existing mapping tools: Flexibility is higher as the functionality can be freely implemented and...

Designing High-Reliability Enterprise Java Systems Through Modular Architecture and Resilience Patterns

by Sriram Ghanta

2026, International Journal of Scientific Research in Science and Technology

Enterprise software systems increasingly demand high reliability, scalability, and adaptability as they operate in environments characterized by rapidly evolving business requirements, heterogeneous integrations, and unpredictable runtime...

Engineering Highly Reliable and Transaction-Safe Data Processing Frameworks Using JPA and Hibernate for Scalable Enterprise Application Systems

by Sriram Ghanta

2026, International Journal of Scientific Research in Science and Technology

Enterprise application ecosystems increasingly depend on object relational mapping frameworks to manage complex transactional workloads while maintaining data integrity, consistency, and scalability across distributed environments. Persistent challenges remain in balancing transactional safety with performance efficiency, particularly when applications evolve beyond monolithic architectures into layered and service-oriented systems. The objective of this research is to examine how Java Persistence API and Hibernate can be systematically engineered to deliver highly reliable and transaction safe data processing frameworks suited for large scale enterprise application systems. The study addresses limitations in conventional persistence strategies by analyzing transactional behavior, isolation control, persistence context management, and failure recovery mechanisms. A mixed methodological approach is adopted, combining quantitative performance evaluation under controlled transactional workloads with qualitative architectural analysis drawn from enterprise deployment patterns. Empirical findings demonstrate that carefully configured transaction boundaries, optimized persistence context handling, and disciplined use of concurrency controls significantly reduce rollback frequency while sustaining throughput under scale. The work introduces an integrated framework that aligns transactional semantics with application-level consistency requirements, offering a practical bridge between academic theory and industrial implementation. Strategic contributions include guidance for architects and engineering leaders seeking predictable transactional behavior in high demand systems, as well as a structured reference model for future empirical studies in enterprise persistence engineering. 
The findings reinforce the role of disciplined transaction management as a foundational capability for reliable enterprise systems and provide actionable insights with lasting relevance for both academic research and industry practice.

Observations on Porting In-memory KV stores to Persistent Memory

by Parv Saxena

2025, ArXiv

Systems that require high-throughput and fault tolerance, such as key-value stores and databases, are looking to persistent memory to combine the performance of in-memory systems with the data-consistent fault-tolerance of nonvolatile...

Implementing Partial Persistence in Object-Oriented Languages

by Roel Wuyts

2025, 2008 Proceedings of the Tenth Workshop on Algorithm Engineering and Experiments (ALENEX)

A partially persistent data structure is a data structure which preserves previous versions of itself when it is modified. General theoretical schemes are known (e.g. the fat node method) for making any data structure partially...
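
A minimal Python sketch of the fat node idea for a single field is given here for orientation. It assumes integer version stamps that only grow, and the class name `FatField` is hypothetical; it is not the implementation evaluated in the paper.

```python
import bisect


class FatField:
    """A field that keeps every value it has held, indexed by version stamp."""

    def __init__(self, initial, version: int = 0) -> None:
        self._versions = [version]  # sorted version stamps
        self._values = [initial]    # value written at each stamp

    def write(self, version: int, value) -> None:
        # Partial persistence: only the newest version may be modified.
        if version < self._versions[-1]:
            raise ValueError("cannot modify an old version")
        if version == self._versions[-1]:
            self._values[-1] = value
        else:
            self._versions.append(version)
            self._values.append(value)

    def read(self, version: int):
        # Binary search for the last write at or before the requested version.
        i = bisect.bisect_right(self._versions, version) - 1
        if i < 0:
            raise KeyError("field did not exist at that version")
        return self._values[i]


field = FatField("a", version=0)
field.write(1, "b")
field.write(3, "c")
```

Reads of any past version stay possible, at the cost of a logarithmic search in the version list of the field.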

Figure 1: Structure to save versions of a field.

Figure 2: Number of insertions vs. average time per insertion.

Figure 3: Number of elements in the structure vs. average time per search.

Figure 4: Number of updates vs. average time per update.

Figure 5: Number of updates+reads vs. average time per read.

Figure 6: Insertion in non persistent and persistent treaps: number of insertions vs. average time per insertion.

Figure 7: Search in non persistent and persistent treaps: number of elements in treap vs. average time per search.

Figure 8: Planar Point Location: number of points in the plane vs. the time to locate a random point.

Figure 9: Sizes for object with 1, 2, 3 and 4 fields: number of update followed by snapshot vs. the size of the object.

Persistent memory and orthogonal persistence: a persistent heap design and its implementation for the Java virtual machine

by Taciano Dreckmann Perez

2024

for their contributions to the implementation of JaphaVM. It was great belonging to this team. I owe a great debt to Pedro Garcez Monteiro, for his valuable contributions to JaphaVM, first at the LIS lab, and later as an HP colleague....

Table 5.5 — I/O counters for Traversal 1 on tiny DB.

Table 5.6 — I/O counters for Traversal 1 on small DB.

Listing A.23 - ARRAY STORE macro (interp/engine/interp.c)

Table 5.7 — I/O counters for Traversal 1 on medium DB.

Table 3.2 — Summary of Mnemosyne’s programming interface. Extracted from [82].

application code can also make use of it for their own purposes.

executes, the value of the integer variable flag will switch between 0 and 1.

Listing 3.1 — Mnemosyne Example Using Static Variables

to a new chunk with the given size, erases the old data and sets the old chunk as free. The hash table initialization code was changed to avoid resetting all entries during JVM startup.

Figure 5.1 — Comparison of execution times for OO7 traversals of different db sizes (secs, log scale)

The relational database scenarios (PostgreSQL and H2) consistently take the longest time to execute traversal benchmarks. This is explained by the fact that these scenarios require ORM translation and access to storage, resulting in a larger amount of data movement, which can be observed in Tables 5.5–5.7. Data movement is proportional to payload; the larger the database, the more data movement is observed. PostgreSQL and H2 have very similar execution times, despite the former storing data on disk and the latter in memory. This happens because the experimental system has enough memory to keep the PostgreSQL database fully cached in memory. The trend for fewer context switches on JaphaVM is still observable, for the same reasons presented in the database ...

The ORM layer used in the relational database scenarios creates many additional short-lived objects, and as a consequence, the JVM spends more time doing Garbage Collection (GC). For the medium database size, all objects created by JaphaVM fit into the 8GB heap, not requiring GC, while the relational database scenarios required between 56 and 68 GCs (15 to 16 of them also performing heap compactions), which accounted for 15.62% to 17.31% of the overall database creation time. We also observe that JaphaVM execution has fewer context switches. This happens because it performs significantly less storage access, and thus is blocked less often by I/O waits, which are a common cause of voluntary context switches.

The Managed Data Structures (MDS) library [34], created by Hewlett-Packard Enterprise, provides a high-level programming model for persistent memory. It is designed to take advantage of large, random-access, non-volatile memory and highly-parallel processing. Application programmers use and persist their data directly in their application, in common data structures such as lists, maps and graphs, and the MDS library manages this data. The library supports multi-threaded, multi-process creation, use and sharing of managed data structures, via APIs in multiple programming languages, Java and C++.

(d) Comparison of execution times for traversal 2c

iterate over the elements. When a programmer creates a managed data structure, like this ManagedArray, MDS allocates this object directly in the MDS Managed Space. This Managed Space is one large virtual pool of memory; a shared, persistent heap in which MDS allocates and manages MDS application objects with the help of the Multi-Process Garbage Collector. MDS objects are allocated in the MDS Managed Space using create method calls on an object type; in contrast with a programmer calling the new method to create a standard Java object in the Java single-process volatile heap. All MDS objects are strongly typed. Managed types support a createArray() method to create a managed array of that type: for example, ManagedInt.TYPE (the managed type for integers) has ...

Table 5.1 — Properties of OO7 database size presets.

Table 5.12 — Development complexity for OO7 scenarios. We have compared both OO7 and Lucene baselines against their OP versions. Three simple complexity metrics were collected for each implementation: logical lines of code (LOC), number of classes and number of methods. The OO7 results are listed on Table 5.12 and Lucene’s on Table 5.13.

Table 5.9 — Lucene Indexing I/O counters.

Table 5.10 — Lucene single-term query I/O counters.

The DLL hash table is not stored, and thus needs to be recreated at each JVM execution.

Table 3.1 — Comparison of memory/storage technologies.

Phase-Change Random Access Memory (also called PCRAM, PRAM or PCM) is currently the most mature of the new memory technologies under research. It relies on some materials, called phase-change materials, that exist in two different phases with distinct properties: an amorphous phase, characterized by high electrical resistivity, and a crystalline phase, characterized by low electrical resistivity [75]. These two phases can be repeatedly and rapidly cycled by applying heat to the material [18, 75].

SoftPM’s implementation consists of two main components: the Location Independent Memory Allocator (LIMA), and the Storage-optimized I/O Driver (SID). LIMA manages the container’s persistent data as a collection of memory pages marked for persistence. When creating a persistence point, LIMA is responsible for identifying the graph of data structures referenced by the container and marking the ones which were modified, in order to be persisted. SID atomically commits container data to persistent storage, which can be disks, flash drives, network, or memcachedb. SoftPM’s architecture is depicted on Figure 2.2.

A struct called OPC (Orthogonal Persistence Context), shown previously on Listing A.12, contains

Table 2.1 — SoftPM API. Extracted from [40].

... persistence root, is allocated using the pCAlloc function, but its contents are not made persistent at this point. Whenever the pPoint function is called, it creates a persistence point, i.e., all data reachable from the container will be made persistent. In subsequent executions of the program, the pCRestore function returns a pointer to a container previously created. Code Listing 2.6 describes the implementation of a persistent list. pCAlloc allocates a container and pPoint makes it persistent.

In the next section, we evaluate development complexity metrics for JaphaVM compared to

Table 5.11 — Lucene double-term query I/O counters.

Listing A.32 — PUTSTATIC QUICK opcode (interp/engine/interp.c)

Proxy is a simple class which contains one instance attribute called aString. The class IndirectExample has one class attribute of the type Proxy called proxy. Every time the method main() is invoked, it concatenates "AB" to the proxy.aString value (Line 11). As Java Strings are immutable, every concatenation generates a new String object in the heap. It is similar to the previous example, but now the references from a class attribute are indirect.

Allocate and initialize Java monitors hash table.

Listing 2.1 — PS-algol example of adding a record to a table

Listing 2.2 — PS-algol example of retrieving a record from a table

Table 5.2 — I/O counters for tiny DB creation.

Table 5.3 — I/O counters for small DB creation.

Table 5.4 — I/O counters for medium DB creation.

The Methodblock implementation can be viewed in code listing A.4.

Listing A.7 — ExecutionEnvironment structure

Orthogonal persistence in nonvolatile memory architectures: A persistent heap design and its implementation for a Java Virtual Machine

by Taciano Dreckmann Perez

2024, Software: Practice and Experience

Summary: Current computer systems separate main memory from storage, and programming languages typically reflect this distinction using different representations for data in memory and storage. However, moving data back and forth between...

Congeries, mapping and Grasshopper

by Maurice Ashton

2024

Analysis of Different ORM Tools for Data Access Object Tier Generation: A Brief Study

by Kuldeep Hule

2024, International journal of membrane science and technology

The Data Access Tier, also known as the Data Access Layer (DAL), is a specific tier within the 3-tier architecture or other software architectural patterns. The interaction between the application's business logic and the underlying data...

Alternatively, MVC architecture can be described as a 3-tier system design, a kind of software design in which each element of the MVC structure forms a tier, or layer, of logical building blocks. By separating the user interface, business logic, and data storage levels, a three-tier structure offers numerous advantages in software development and production. It gives development teams more freedom to change one element of the programme independently of the others. The diagram below depicts a three-tier architecture:

Object Relational Mapping is a method of integrating an object-oriented programming language with databases. It lets programmers construct mappings between tuples in database relations and objects in the programming language. The OR-mapping technique adds one more layer between the business and data layers for mapping purposes, as indicated in the figure below.

Because of the additional middle layer between the Business Logic Layer and the Data Layer, the application suffers several drawbacks: execution time increases, extra dependencies are introduced, and extra learning is required to understand and use the OR mapping tool.

Figure 6. Proposed Architecture of the Research Work

Framework is the overall term for the idea of making mappings between tables, stored strategies and their fields, and object-oriented classes and their fields, so that at runtime an Entity Definition object can be addressed as a table row in an object-oriented program by means of a class object, and vice versa. The following diagram illustrates the proposed research methodology:

Figure 4. Typical Execution Durations for Various Programming Languages

Figure 1. MVC Architecture (Adapted from https://www.guru99.com/mvc-tutorial.html). It has three elements: a model, which includes all the information and its related connections; a view, which presents information to the user; and a controller, which acts as a mediator between the model and view elements [42].

1. PROPOSED RESEARCH METHODOLOGY

International Journal of Membrane Science and Technology, 2023, Vol. 10, No. 1, pp 1277-1291

Shoaib Mahmood Bhatti, Zahid Hussain Abro, Farzana Rauf Abro et al. [29] presented a performance evaluation of three popular Java-based object-relational mapping (ORM) tools: Hibernate, Ebean, and TopLink. ORM tools are used to map object-oriented code to relational databases. The authors tested the performance of basic CRUD (create, read, update, delete) operations using these tools with a sample database. The results showed that overall, Ebean had the fastest execution times, especially for read queries with different comparison operators. Hibernate was fastest for insert operations. The authors recommend Ebean as the top performing ORM tool, followed by Hibernate. They suggest future work could evaluate more complex queries and other ORM tools, on newer hardware and across operating systems. Mikhail Gorodnichev [10] and his team explored object-relational mapping (ORM) systems, which bridge the gap between object-oriented programming and relational databases. They discussed the semantic differences between the two approaches that lead to the "impedance mismatch" problem. The authors evaluated the Entity Framework ORM system using a test database. Initial results showed a 37x slowdown with ORM versus direct SQL queries. As shown in figure:

Table 1. Performance statistics for SQL queries at Contoso

As a result, they observed various consequences of using an ORM, such as eager retrieval of columns, unwanted nested queries, additional sorting when updating records, duplicate code when including columns in the application, and slower execution in both aggregate and execution time. They did not consider the cost issue, and they examined only the Entity Framework ORM tool.

Persistence In the Grasshopper Kernel

by Anders Lindstrom

2024, AUSTRALIAN …

The Grasshopper operating system provides explicit support for orthogonal persistence. A consequence of this is that the kernel itself must, in part, be persistent. To conform to the model of persistence in Grasshopper, the kernel...

KITTy: A PACKAGE FOR EXTERNAL PATCHES COMMUNICATION MANAGEMENT IN MAX/MSP–A PROGRESS REPORT

by Paulo Ferreira-Lopes

2024, music.mcgill.ca

We present KITTy (Kit d'Interfaçage Tout Terrain), a package programmed in Max/MSP allowing users to design their own network of integrated external patches. This package provides persistence and state-storage mechanisms within a network...

Persistence software

by Arthur Keller

2024, Sigmod Record

Building object-oriented applications which access relational data introduces a number of technical issues for developers who are making the transition to C++. We describe these issues and discuss how we have addressed them in...

Tracking in Order to Recover: Detectable Recovery of Lock-Free Data Structures

by Ohad Ben-Baruch

2024, Zenodo (CERN European Organization for Nuclear Research)

This paper presents a generic approach for deriving detectably recoverable implementations of many widely-used concurrent data structures. Such implementations are appealing for emerging systems featuring byte-addressable non-volatile...

Upper and Lower Bounds on the Space Complexity of Detectable Object

by Ohad Ben-Baruch

2024, arXiv (Cornell University)

The emergence of systems with non-volatile main memory (NVM) increases the interest in the design of recoverable concurrent objects that are robust to crash-failures, since their operations are able to recover from such failures by using...

Automatic Extraction of a Document-oriented NoSQL Schema

by Amal Brahim

2024, Proceedings of the 23rd International Conference on Enterprise Information Systems

The NoSQL systems make it possible to manage Databases (DB) verifying the 3Vs: Volume, Variety and Velocity. Most of these systems are characterized by the property schemaless which means absence of the data schema when creating a DB....

Selective caching: a persistent memory approach for multi-dimensional index structures

by Muhammad Abba Jibril

2024, Distributed and Parallel Databases

After the introduction of Persistent Memory in the form of Intel’s Optane DC Persistent Memory on the market in 2019, it has found its way into manifold applications and systems. As Google and other cloud infrastructure providers are...

Mechanisms for application-level recoverable-persistence in a single address space

by Gianluca Dini

2024, Microprocessors and Microsystems

In this paper we consider mechanisms for supporting recoverable-persistence in a single address space memory model. In particular, we consider a memory management system, MMS, for a persistent single address space and show how the...

Providing orthogonal persistence for Java

by Malcolm Atkinson

2024, Lecture Notes in Computer Science

An orthogonally persistent Java

by Malcolm Atkinson

2024, ACM SIGMOD Record

The language Java is enjoying a rapid rise in popularity as an application programming language. For many applications an effective provision of database facilities is required. Here we report on a particular approach to providing such...

Fast Nonblocking Persistence for Concurrent Data Structures

by Mingzhe Du

2024, arXiv (Cornell University)

We present a fully lock-free variant of our recent Montage system for persistent data structures. The variant, nbMontage, adds persistence to almost any nonblocking concurrent structure without introducing significant overhead or blocking...

Casper: a cached architecture supporting persistence

by Francis Vaughan

2024, Computing Systems

Persistent object systems greatly simplify programming tasks since they hide the traditional distinction between short-term and long-term storage from the applications programmer. As a result, the programmer can operate at a level of...

Persistence In the Grasshopper Kernel

by Francis Vaughan

2024, AUSTRALIAN …

Weaving Rules into Models@run.time for Embedded Smart Systems

by Yves Le Traon

2023

Smart systems are characterised by their ability to analyse measured data in live and to react to changes according to expert rules. Therefore, such systems exploit appropriate data models together with actions, triggered by...

Persistence in NICMOS: Results from On-Orbit data

by Doris Daou

2023

This ISR presents the results of the analysis of NICMOS persistence data taken as part of the Servicing Mission Orbital Verification (SMOV). This test is a sequel to the System Level Thermal Vacuum (SLTV) persistence tests performed with...

Uncovering Steady State Executions in Java Microbenchmarking with Call Graph Analysis

by Sneh Patel

2023, Companion of the 2023 ACM/SPEC International Conference on Performance Engineering

Developers often use microbenchmarking tools to evaluate the performance of a Java program. These tools run a small section of code multiple times and measure its performance. However, this process can be problematic as Java execution is...

Anti-Persistence on Persistent Storage

by David Zage

2023, Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems

We present history-independent alternatives to a B-tree, the primary indexing data structure used in databases. A data structure is history independent (HI) if it is impossible to deduce any information by examining the bit representation...
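
One simple way to get a feel for history independence is a canonical representation, where the stored layout depends only on the current contents. The Python sketch below uses a sorted array for this purpose; it is an illustrative toy and not the B-tree alternative from the paper, and the name `CanonicalSet` is hypothetical. It models only the logical layout, not the bits Python actually keeps in memory.

```python
import bisect


class CanonicalSet:
    """A set whose logical layout is fully determined by its contents."""

    def __init__(self) -> None:
        self._items = []  # always sorted, no duplicates, no tombstones

    def insert(self, x: int) -> None:
        i = bisect.bisect_left(self._items, x)
        if i == len(self._items) or self._items[i] != x:
            self._items.insert(i, x)

    def delete(self, x: int) -> None:
        i = bisect.bisect_left(self._items, x)
        if i < len(self._items) and self._items[i] == x:
            self._items.pop(i)  # remove outright so no trace of x remains

    def layout(self) -> tuple:
        # What an observer inspecting the stored representation would see.
        return tuple(self._items)


a = CanonicalSet()
for x in (3, 1, 2):
    a.insert(x)

b = CanonicalSet()
for x in (2, 9, 3, 1):
    b.insert(x)
b.delete(9)  # b once held 9, but its layout no longer shows that
```

Two different operation histories that end in the same contents produce the same layout here, so the layout reveals nothing about past insertions or deletions. An append-only log of the same operations would not have this property.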

Constructing Database Systems in a Persistent Environment

by Malcolm Atkinson

2023, Very Large Data Bases

The goal of the Persistent Programming Research Group is the provision of an environment which incorporates the principle of orthogonal persistence in order to facilitate the production of large and complex software. A database management...

The Napier Type System

by Malcolm Atkinson

2023, Workshops in Computing

Here we describe a browser that provides a two and a half dimensional viewing mechanism for persistent data structures. The browser is an adaptive program which learns about its environment; this knowledge is stored in the persistent...

Selective caching: a persistent memory approach for multi-dimensional index structures

by Kai-Uwe Sattler

2023, Distributed and Parallel Databases

Dynamic data models: an application of MOP-based persistence in Common Lisp

by Pierre Thierry

2023

Selective caching: a persistent memory approach for multi-dimensional index structures

by Muhammad Jibril

2023, Distributed and Parallel Databases

PCLOS: a critical review

by A. Paepcke

2023, ACM SIGPLAN Notices

This paper uses the persistent object system PCLOS to survey some problems and benefits of object persistence. The system is analyzed along several relevant dimensions. PCLOS provides object persistence for an object-oriented language....

KITTy: A PACKAGE FOR EXTERNAL PATCHES COMMUNICATION MANAGEMENT IN MAX/MSP–A PROGRESS REPORT

by Paulo Ferreira-Lopes

2023, music.mcgill.ca

Selective caching: a persistent memory approach for multi-dimensional index structures

by David Broneske

2023, Distributed and Parallel Databases

Fig. 7 Continuous throughput of cached Elf variants for partial-match queries

Fig. 5 Continuous throughput of cached Elf variants for exact-match queries

3) The access to the always persistent MonoList for each query (≈ 50% performance impact).

several columns. The idea of MonoLists is that whenever there is no branch-out on deeper levels, the linked lists are merged to a single MonoList, thus eliminating pointers and distributed storage. To this end, on the upper level, Elf is similar to a column store. On deeper levels, it slowly converges to a row-store-like layout. This effectively compresses the data set [5].

partial-match queries use a range size of 2% and 100% per set dimension to demonstrate the extremes. As expected, DRAM exhibits a better performance than PMem. The overhead of building the Elf and of executing the three query types amounts to 18%, 223%, 210%/70%, and 236%/66%, respectively. For range and partial-match queries with a higher range size, the runtimes of the volatile and persistent versions are much closer. Greater ranges lead to a more sequential access pattern and more commonly traversed DimensionLists, which will end up in the CPU caches for both the volatile and the persistent version. This is not the case for most exact-match queries due to their tiny query windows. Altogether, our results show that the performance gap between DRAM and PMem is wider for queries, especially exact-match queries, than for building. Our explanation is that a sequential access pattern is better supported on PMem than a random one. Particularly during building, the write-combining buffer of PMem seems to be quite efficient if there is only a single sequentially writing thread. In the following experiments, we primarily focus on a low selection percentage since random access patterns offer the greatest potential for improvement by selective caching.

virtual object pointer. The linearized Elf array is stored separately and reachable from the Elf object. Similarly, the virtual address of the array is stored to avoid costly persistent dereferencing. Another drawback of persistent pointers is that they are twice the size of virtual pointers. Actually, it is not necessary to store volatile pointers in the persistent pool, but it helps with the visualization of our utilization of them. To ensure atomicity, we used libpmemobj transactions in memory allocations for the persistent Elf object, the persistent Elf array, and the index build. We additionally wrap the member variables of the persistent Elf class, such as the size and the number of dimensions, with the persistent property class. Although in our experiments we do not modify the tree after initially building it, this is reasonable for later inserts or in-place updates. Due to its size, the data array is not wrapped as one persistent property. Instead, the modified ranges in the array need to be manually added to a transaction.

Hybrid Elf: In the hybrid Elf we propose to create a volatile copy in DRAM

to identify their optimal setting. On top of that, we added two baselines. First, the pure PMem-based variant (labeled as w/o caching) and second, a hybrid variant, which this time, however, only accesses the DRAM copy for DimensionLists and always obtains MonoLists via the PMem copy of the Elf (labeled as dual access). The results are shown in Fig. 4a for exact-match queries and Fig. 4b for range queries.

queries), on the uniform data, it can outperform the pure PMem-based Elf by up to ‘5%. For the TPC-H data only 1-2% were possible. For static caching, we omitted some configurations to keep the figures clean. Caching the first one or two levels statically has similar performance and leads already to a little better initial performance. The peak for this setup could be achieved when statically caching the first four levels. Caching three or more than four levels behaves similarly to static 6 levels in Fig. 5a. The correlated Lineitem table reached its peak performance with eleven levels and higher. Again, the behaviour is not linear as, e.g., for four levels the performance goes up, with five levels down again, and from nine levels it gets continuously better. Contrary to what we assumed before [17], more caching levels do not necessarily result in better performance. Rather, the size compared to the CPU caches, successful branch predictions, the commonly accessed DimensionLists, and again the size of the underlying hash table are more important. For Fig. 5a, dimension levels one and two completely fit in the L1 cache, level three is slightly larger than L2, and all others are greater than the LLC. For instance, four levels require 136 MiB of DRAM which is 10x the LLC. Compared to the total size of Elf this is merely 3% space overhead for about a 30% performance boost. Adding the dynamic cache on top of the static cache with four levels leads to the best currently achieved performance. It also increases the throughput for the TPC-H data. Here, we combined it with nine static levels since with eleven, the static and dynamic caches would contain almost the same DimensionLists and there would be no more DimensionLists left to be cached by the dynamic part. As mentioned earlier, we reordered the columns of the TPC-H Lineitem table to exploit prefix redundancies. Unique columns do not allow for DimensionLists. After reordering the columns, the dimensions twelve to fifteen were the primary and foreign keys, which have no DimensionLists except for a relatively few in dimension twelve. Thus the limitation of combining the dynamic cache with nine static levels instead of


Implementing Optimistic Concurrency Control for Persistence Middleware Using Row Version Verification

by Martti Laiho

2023, 2010 Second International Conference on Advances in Databases, Knowledge, and Data Applications

Modern web-based applications are often built as multi-tier architecture using persistence middleware. Middleware technology providers recommend the use of Optimistic Concurrency Control (OCC) mechanism to avoid the risk of blocked... more

Figure 1. Transactions and isolation levels of the sample use case
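
The abstract above is truncated, so the following is only a minimal sketch of row version verification in general, not the paper's implementation. It assumes a hypothetical `account` table with an integer `version` column and uses Python's standard `sqlite3` module; the names `read_account`, `update_balance`, and `StaleRowError` are illustrative. The idea is that an update succeeds only if the row still carries the version the client originally read.

```python
import sqlite3


class StaleRowError(Exception):
    """Raised when the row was changed by someone else since it was read."""


def read_account(conn, account_id):
    # Return (balance, version) as seen at read time.
    return conn.execute(
        "SELECT balance, version FROM account WHERE id = ?", (account_id,)
    ).fetchone()


def update_balance(conn, account_id, new_balance, expected_version):
    # The WHERE clause verifies the row version; a stale version matches no row.
    cur = conn.execute(
        "UPDATE account SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    if cur.rowcount != 1:
        conn.rollback()
        raise StaleRowError(f"account {account_id} changed since it was read")
    conn.commit()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
    )
    conn.execute("INSERT INTO account VALUES (1, 100, 0)")
    conn.commit()

    # Two clients read the same row version.
    balance_a, version_a = read_account(conn, 1)
    balance_b, version_b = read_account(conn, 1)

    update_balance(conn, 1, balance_a - 30, version_a)  # first writer wins
    try:
        update_balance(conn, 1, balance_b + 50, version_b)  # stale version
        conflict = False
    except StaleRowError:
        conflict = True
    return read_account(conn, 1), conflict
```

In this sketch the second writer is rejected instead of silently overwriting the first update; a real application would typically re-read the row and retry or report the conflict to the user.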


JuxMem: An Adaptive Supportive Platform for Data Sharing on the Grid

by Mathieu Jan

2023, Scalable Computing: Practice and Experience

We address the challenge of managing large amounts of numerical data within computing grids consisting of a federation of clusters. We claim that storing, accessing, updating and sharing such data should be considered by applications as... more


GDS: An Architecture Proposal for a Grid Data-Sharing Service

by Mathieu Jan

2023, Future Generation Grids

Grid computing has recently emerged as a response to the growing demand for resources (processing power, storage, etc.) exhibited by scientific applications. We address the challenge of sharing large amounts of data on such... more


GDS: An Architecture Proposal for a Grid Data-Sharing Service

by P. Sens

2023, Future Generation Grids


MetaData persistence using storage class memory

by Jithin Jose

2023, Proceedings of the 1st Workshop on Interactions of NVM/FLASH with Operating Systems and Workloads

Storage Class Memory (SCM) blends the best properties of main memory and hard disk drives. It offers non-volatility and byte addressability, and promises short access times with low cost per bit. Earlier research in this field explored... more


(O 3) 2: From "poor-man's persistence" to transparent clustering for Java applications

by Pedro Sampaio

2023, Middleware'10 Posters and Demos Track, Middleware Posters'10


Fast and Lean Immutable Multi-Maps on the JVM based on Heterogeneous Hash-Array Mapped Tries

by Jurgen Vinju

2023, ArXiv

An immutable multi-map is a many-to-many thread-friendly map data structure with expected fast insert and lookup operations. This data structure is used for applications processing graphs or many-to-many relations as applied... more
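
To illustrate only the interface such a structure offers, here is a minimal Python sketch of an immutable multi-map. It is not the heterogeneous hash-array mapped trie from the paper: the hypothetical `MultiMap` class below copies a plain dictionary on every insert, which costs time linear in the number of keys, whereas trie-based designs share most of their structure between versions.

```python
class MultiMap:
    """Illustrative immutable multi-map: each key maps to a frozenset of values."""

    def __init__(self, data=None):
        self._data = data if data is not None else {}

    def insert(self, key, value):
        old_values = self._data.get(key, frozenset())
        if value in old_values:
            return self  # nothing changes, so the same version is reused
        new_data = dict(self._data)  # shallow copy; the old version stays intact
        new_data[key] = old_values | {value}
        return MultiMap(new_data)

    def get(self, key):
        return self._data.get(key, frozenset())


m0 = MultiMap()
m1 = m0.insert("a", 1)
m2 = m1.insert("a", 2).insert("b", 1)
```

Because no version is ever mutated, `m0`, `m1`, and `m2` can be read from several threads without locking, which is the sense in which the abstract calls the structure thread-friendly.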


Towards a software product line of trie-based collections

by Jurgen Vinju

2023, Proceedings of the 2016 ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences

Collection data structures in standard libraries of programming languages are designed to excel for the average case by carefully balancing memory footprint and runtime performance. These implicit design decisions and hard-coded... more


Optimizing hash-array mapped tries for fast and lean immutable JVM collections

by Jurgen Vinju

2023, Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications

The data structures underpinning collection API (e.g. lists, sets, maps) in the standard libraries of programming languages are used intensively in many applications. The standard libraries of recent Java Virtual Machine languages, such... more

Table 2. Runtimes of Clojure, Scala, and CHAMP for CFG dominators experiment per CFG count. All libraries are unmodified.

Figure 2. HAMT-based sets with values in internal nodes versus values at the leaves only.

Table 3. Runtimes of Clojure, Scala, and CHAMP for CFG dominators experiment per CFG count. Scala and CHAMP were modified to calculate hash codes lazily, as Clojure does.

Figure 4. Runtime and memory savings of CHAMP compared to Clojure’s PersistentHash{Map, Set}.

Figure 6. Runtime and memory savings of MEMCHAMP compared to Clojure’s PersistentHash{Map, Set}.

Figure 1. Inserting three integers into a HAMT-based set (1a, 1b, and 1c), on basis of their hashes (1d). Figure 1e shows an equivalent and collision-free array-based hash set, with prime number table size 7 and load factor of 75 %.

Figure 7. Runtime and memory savings of MEMCHAMP compared to Scala’s immutable.Hash{Map, Set}.

Figure 5. Runtime and memory savings of CHAMP compared to Scala’s immutable.Hash{Map, Set}.

Table 4. Preliminary measurements of Last-Level Cache misses for data structures of size 27°. The numbers in brackets illustrate how much CHAMP reduces cache misses over the other implementations.

Figure 3. HAMT-based map implementations with values in internal nodes (various variants). The index numbers in the top left corners denote the logical indices for each key/value entry and not their physical indices. Figure 3a specifically shows Clojure’s HAMT implementation that indicates sub-nodes by leaving the array slot for the key empty.
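
The captions above refer to inserting keys into a HAMT-based set according to their hashes. As a rough illustration of that general technique, the Python sketch below implements a persistent HAMT set with bitmap-compressed nodes and path copying. It is not the CHAMP, Clojure, or Scala node layout; the 5-bit chunk size, the 32-bit hash width, and the names `Node`, `Bucket`, `insert`, and `contains` are all assumptions made for this example.

```python
BITS = 5                 # hash bits consumed per trie level (assumed)
MASK = (1 << BITS) - 1
HASH_BITS = 32           # hashes are truncated to 32 bits (assumed)


class Node:
    def __init__(self, bitmap=0, slots=()):
        self.bitmap = bitmap  # bit i is set if logical branch i is occupied
        self.slots = slots    # dense tuple holding keys, Nodes, or Buckets


class Bucket:
    def __init__(self, keys):
        self.keys = keys      # distinct keys whose truncated hashes are equal


def _hash(key):
    return hash(key) & ((1 << HASH_BITS) - 1)


def _index(bitmap, bit):
    # Physical slot position = number of occupied branches below this one.
    return bin(bitmap & (bit - 1)).count("1")


def _merge(k1, h1, k2, h2, shift):
    # Build the smallest subtree that separates two distinct keys.
    if shift >= HASH_BITS:
        return Bucket((k1, k2))
    c1, c2 = (h1 >> shift) & MASK, (h2 >> shift) & MASK
    if c1 == c2:
        return Node(1 << c1, (_merge(k1, h1, k2, h2, shift + BITS),))
    first, second = (k1, k2) if c1 < c2 else (k2, k1)
    return Node((1 << c1) | (1 << c2), (first, second))


def insert(node, key, h=None, shift=0):
    """Return a new root containing key; the old root is left unchanged."""
    h = _hash(key) if h is None else h
    bit = 1 << ((h >> shift) & MASK)
    i = _index(node.bitmap, bit)
    if not node.bitmap & bit:
        return Node(node.bitmap | bit, node.slots[:i] + (key,) + node.slots[i:])
    entry = node.slots[i]
    if isinstance(entry, Node):
        child = insert(entry, key, h, shift + BITS)
        if child is entry:
            return node
    elif isinstance(entry, Bucket):
        if key in entry.keys:
            return node
        child = Bucket(entry.keys + (key,))
    elif entry == key:
        return node
    else:
        child = _merge(entry, _hash(entry), key, h, shift + BITS)
    # Path copying: only the nodes on the path to the key are rebuilt.
    return Node(node.bitmap, node.slots[:i] + (child,) + node.slots[i + 1:])


def contains(node, key):
    h, shift = _hash(key), 0
    while True:
        bit = 1 << ((h >> shift) & MASK)
        if not node.bitmap & bit:
            return False
        entry = node.slots[_index(node.bitmap, bit)]
        if isinstance(entry, Node):
            node, shift = entry, shift + BITS
        elif isinstance(entry, Bucket):
            return key in entry.keys
        else:
            return entry == key


v0 = Node()
v1 = insert(v0, 2)
v2 = insert(v1, 34)  # 34 shares its lowest five hash bits with 2
v3 = insert(v2, 1)
```

Each call to `insert` returns a new version while older versions remain valid. In this example `v3` reuses the subtree that `v2` built for the keys 2 and 34, since inserting 1 only touches the root.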


Implementing Partial Persistence in Object-Oriented Languages

by R. Wuyts

2023, 2008 Proceedings of the Tenth Workshop on Algorithm Engineering and Experiments (ALENEX)


Experimental Performance Evaluation of different Data Models for a Reflection Software Architecture over NoSQL Persistence Layers

by Enrico Vicario

2022, Proceedings of the 7th ACM/SPEC on International Conference on Performance Engineering

The recent rise of the NoSQL movement motivates investigation on the performance impact that new persistence approaches can bring in the model-driven re-engineering of a consolidated object-oriented software architecture. We report... more


(O 3) 2: From "poor-man's persistence" to transparent clustering for Java applications

by Pedro Sampaio

2022, Middleware'10 Posters and Demos Track, Middleware Posters'10


Persistent Data Structure

Key research themes

1. How can persistent memory technology improve the design and performance of persistent data structures?

2. What methods enable efficient incremental maintenance and querying of graph-structured persistent data?

3. How can persistent object-oriented models effectively manage large volumes of structured data with efficient storage and querying, especially when using programming languages like C++?

Related Topics

All papers in Persistent Data Structure