Compiler Optimization

1,376 papers

226 followers

About this topic

Compiler optimization is the process of improving the performance and efficiency of compiled code by transforming the intermediate representation of a program. This involves techniques that enhance execution speed, reduce memory usage, and minimize resource consumption, while preserving the program's correctness and intended functionality.
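As a small illustration of transforming an intermediate representation while preserving behavior, the sketch below performs constant folding on a toy expression tree. The `Const`, `Var`, and `BinOp` node types and the `fold` and `evaluate` helpers are hypothetical names invented for this example; they do not correspond to any particular compiler's IR.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical toy IR: constants, variables, and binary operations.
@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str  # "+" or "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Var, BinOp]

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def fold(e):
    """Replace every operation whose operands are both constants with its result."""
    if isinstance(e, BinOp):
        left, right = fold(e.left), fold(e.right)
        if isinstance(left, Const) and isinstance(right, Const):
            return Const(OPS[e.op](left.value, right.value))
        return BinOp(e.op, left, right)
    return e


def evaluate(e, env):
    """Reference interpreter, used to check that folding preserves meaning."""
    if isinstance(e, Const):
        return e.value
    if isinstance(e, Var):
        return env[e.name]
    return OPS[e.op](evaluate(e.left, env), evaluate(e.right, env))


# x + (2 * 3) becomes x + 6; both forms evaluate identically.
expr = BinOp("+", Var("x"), BinOp("*", Const(2), Const(3)))
folded = fold(expr)
```

Here the folded tree does one operation at run time instead of two, and `evaluate` gives the same result for both trees under any binding of `x`, which is the correctness condition the definition above refers to.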

Key research themes

1. How can machine learning and AI techniques improve compiler optimization across varying applications and architectures?

This theme focuses on leveraging machine learning (ML) and artificial intelligence (AI) methods to automate and enhance compiler optimization strategies. It addresses the challenge of adapting compiler heuristics to complex program features and diverse microarchitectures, enabling compilers to learn effective optimization passes dynamically and predictively. The goal is to improve execution efficiency, resource utilization, and performance portability while reducing human effort in tuning and retargeting compilers for new programs and hardware.

Memory Utilization and Machine Learning Techniques for Compiler Optimization

by Siddarth Singaravel

2021, ITM Web of Conferences

Key finding: This paper surveys cache optimization and multi-memory allocation features, highlighting that machine learning (ML) techniques can guide sustainable computing strategies by intelligently selecting optimization methods... Read more

Portable compiler optimisation across embedded programs and microarchitectures using machine learning

by Edwin Bonilla

2015

Key finding: The paper presents a novel ML model that predicts the best optimization passes for any new program on varying microarchitectural configurations, enabling automatic adaptation without retuning. Across 200 microarchitecture... Read more

Automatic feature generation for machine learning-based optimising compilation

by Edwin Bastidas Bonilla

2021, ACM Transactions on Architecture and Code Optimization

Key finding: This study introduces a grammar-based genetic programming framework to automatically generate effective program features for ML models used in compiler heuristics. Applied to loop unrolling optimization in GCC, automatically... Read more

Advancements in AI-Based Compiler Optimization Techniques for Machine Learning Workloads

by Vasuki Shankar

2025, International Journal of Computer Sciences and Engineering

Key finding: The paper demonstrates that AI-based compiler optimizations, leveraging reinforcement learning and neural architecture search, outperform traditional techniques in optimizing machine learning workloads in terms of energy... Read more

GCDS: A compiler strategy for trading code size against performance in embedded applications

by Z. Chamski

2021

Key finding: This work proposes a global constraints-driven strategy (GCDS) using multiple optimization sequences and a posteriori evaluation of code size vs. performance trade-offs across entire applications rather than individual loops.... Read more

2. What advanced loop transformation and data locality optimizations can compilers implement to improve performance on modern architectures?

This theme covers compiler techniques focused on high-level loop transformations such as induction variable analysis, scalar evolution, loop interchange, skewing, and vectorization enhancements to improve data locality, parallelism, and memory hierarchy utilization. Efficient management of loop-carried dependencies and leveraging advanced analyses like dependence analysis enable compilers to unlock more effective loop-level optimizations, crucial for performance on systems with deep memory hierarchies and parallel processors.
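To make one of the transformations named above concrete, here is a minimal sketch of loop interchange, written by hand rather than produced by a compiler. The function names `column_sums_before` and `column_sums_after` are hypothetical. The interchange is legal in this case because each `sums[j]` still receives its terms in the same order of `i`, so no dependence is violated.

```python
def column_sums_before(a):
    """Column-major traversal: the inner loop jumps from row to row."""
    rows, cols = len(a), len(a[0])
    sums = [0] * cols
    for j in range(cols):
        for i in range(rows):
            sums[j] += a[i][j]
    return sums


def column_sums_after(a):
    """Interchanged loops: the inner loop walks along one row at a time."""
    rows, cols = len(a), len(a[0])
    sums = [0] * cols
    for i in range(rows):
        for j in range(cols):
            sums[j] += a[i][j]
    return sums


matrix = [[1, 2, 3], [4, 5, 6]]
before = column_sums_before(matrix)
after = column_sums_after(matrix)
```

In a language with contiguous row-major arrays, the second form touches memory sequentially in the inner loop, which is the data-locality benefit this theme is concerned with. Python lists of lists do not show that effect reliably, so the sketch only demonstrates that the two loop orders compute the same result.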

High-Level Loop Optimizations for GCC

by David Edelsohn

2016

Key finding: The paper details the design of a GCC infrastructure leveraging TreeSSA for improved induction variable and scalar evolution analysis combined with data dependence tests, enabling a matrix-based approach to safely and... Read more

Compiler-Directed Transformation for Higher-Order Stencils

by Samuel Williams

2022, 2015 IEEE International Parallel and Distributed Processing Symposium

Key finding: By introducing a novel partial sums reordering transformation that exploits symmetry and common subexpressions in high-order stencil computations, this compiler optimization significantly reduces floating-point operations and... Read more

URECA: A Compiler Solution to Manage Unified Register File for CGRAs

by Shail Dave

2018, Design, Automation & Test in Europe Conference & Exhibition (DATE)

Key finding: URECA introduces a compiler-managed unified nonrotating register file (RF) for CGRA architectures that efficiently handles both recurring and nonrecurring loop variables by dynamically partitioning the RF and preloading... Read more

Integer affine transformations of parametric Z-polytopes and applications to loop nest optimization

by Benoit Meister

2022

Key finding: The paper proposes new polynomial-time algorithms for counting and computing integer affine transformations of unions of parametric Z-polytopes—a mathematical abstraction critical for analyzing loop nests with parameters. The... Read more

3. How can compiler frameworks enable higher-level programming models and integrate heterogeneous computing workflows effectively?

This theme investigates compiler design approaches that bridge high-level programming abstractions—such as tasks, parallel loops, and domain-specific functions—with low-level hardware execution models, especially in heterogeneous systems. The objective is to combine programmability, portability, and performance by transforming high-level constructs (e.g., OpenMP tasks) into efficient execution engines like CUDA graphs or symbolic compilation frameworks, facilitating automated parallelization and hybrid CPU-GPU programming.

OpenMP to CUDA graphs

by chenle YU

2022, Proceedings of the 23th International Workshop on Software and Compilers for Embedded Systems

Key finding: This paper presents a novel compiler transformation that converts OpenMP tasking and accelerator model code into CUDA graphs by representing OpenMP programs as static task dependency graphs (TDGs). The approach uncouples... Read more

TAM: A Front-End to an Auto-Parallelizing Compiler

by Gaurav Singal

2022, ACM Transactions on Asian and Low-Resource Language Information Processing

Key finding: TAM is a parallelizing compiler front-end that parallelizes all stages of compilation (lexical, syntax, semantic analysis, and IR generation) using data dependency graphs to maximize utilization of multicore CPUs. It... Read more

Grisette: Symbolic Compilation as a Functional Programming Library

by 思睿卢

2023, Proceedings of the ACM on Programming Languages

Key finding: Grisette provides a purely functional, statically typed symbolic evaluation framework implemented as a library, allowing symbolic compilation and reasoning about all program paths with merged states using ordered-guards... Read more

Automatic compiler/interpreter generation from programs for Domain-Specific Languages: Code bloat problem and performance improvement

by Miha Ravber

2024

Key finding: The authors propose methods combining semantic inference with multi-threading and reduced input sampling to automatically generate compiler and interpreter components for domain-specific languages (DSLs) directly from example... Read more

All papers in Compiler Optimization

In search of a program generator to implement generic transformations for high-performance computing

by Christoph Herrmann

2026, Science of Computer Programming

The quality of compiler-optimized code for high-performance applications is far behind what optimization and domain experts can achieve by hand. Although it may seem surprising at first glance, the performance gap has been widening over... more

NISQ-Era Quantum Simulation of Lattice Gauge Theories: A Review of Encoding Strategies, Variational Methods, and Error Mitigation Techniques

by Mirza Adnan Mohtashim

2026, Zenodo

Quantum simulation of lattice gauge theories (LGTs) represents one of the most promising applications of near-term intermediate-scale quantum (NISQ) devices. This review provides a comprehensive synthesis of recent developments in... more

Optimization Techniques for Matrix Multiplication Kernels in Linear Algebra Libraries: A CPU-Focused Approach

by IJCSMC Journal

2026, International Journal of Computer Science and Mobile Computing (IJCSMC)

Matrix multiplication is a fundamental operation in linear algebra libraries, serving as the computational backbone for scientific computing, machine learning, and data analytics applications. This paper presents a comprehensive analysis... more

Compiler Optimization Pass Visualization

by Weijia Shang

2026, ACM Transactions on Computing Education

There is an active research community concentrating on visualizations of algorithms taught in CS1 and CS2 courses. These visualizations can help students to create concrete visual images of the algorithms and their underlying concepts.... more

Finding Best Compiler Options for Critical Software Using Parallel Algorithms

by Enrique Alba

2026, Studies in computational intelligence

The efficiency of a software piece is a key factor for many systems. Real-time programs, critical software, device drivers, kernel OS functions and many other software pieces which are executed thousands or even millions of times per day... more

A Genetic Algorithm Approach to Scheduling Communications for a Class of Parallel Space-Time Adaptive Processing Algorithms

by John Antonio

2026, Springer eBooks

The work described here introduces a practical and accurate tool for predicting power consumption for FPGA circuits. The utility of the tool is that it enables FPGA circuit designers to evaluate the power consumption of their designs... more

An Aho-Corasick Based Assessment of Algorithms Generating Failure Deterministic Finite Automata

by Madoda Madoda

2026

The Aho-Corasick algorithm derives a failure deterministic finite automaton for finding matches of a finite set of keywords in a text. It has the minimum number of transitions needed for this task. The DFA-Homomorphic Algorithm (DHA)... more

Reliable and Precise WCET and Stack Size Determination for a Real-life Embedded Application

by Philippe Baufreton

2026, ISoLA

Failure of a safety-critical application on an embedded processor can lead to severe damage or even loss of life. Here we are concerned with two kinds of failure: stack overflow, which usually leads to run-time errors that are difficult... more

Compilation and Embedded Computing Systems

by Hadda Cherroun

2026

Compsys is located at Ecole normale supérieure de Lyon.

Autonomous Quality Agents: Policy-Driven Test Generation and Intelligent Orchestration for Continuous Software Assurance

by Srikanth C Vankayala

2026, EJAET

The increasing complexity of cloud-native, distributed, and AI-enabled software systems has rendered traditional, static quality assurance (QA) practices increasingly inadequate, as fixed test suites and manually curated strategies... more

Compiler optimization on instruction scheduling for low power

by TingTing Hwang

2026, Proceedings 13th International Symposium on System Synthesis

Compiler optimization on VLIW instruction scheduling for low power

by TingTing Hwang

2026, ACM Transactions on Design Automation of Electronic Systems

In this article, we investigate compiler transformation techniques regarding the problem of scheduling VLIW instructions aimed at reducing power consumption of VLIW architectures in the instruction bus. The problem can be categorized into... more

Caracal

by David Kaeli

2026, Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units

Graphics Processing Units (GPU) have become the platform of choice for accelerating a large range of data parallel and task parallel applications. Both AMD and NVIDIA have developed GPU implementations targeted at the high performance... more

Static memory access pattern analysis on a massively parallel GPU

by David Kaeli

2026, SAAHPC. ACM

The performance of data-parallel processing can be highly sensitive to any contention in memory. In contrast to multi-core CPUs which employ a number of memory latency minimization techniques such as multi-level caching and prefetching,... more

Compiler Optimizations for Transaction Processing Workloads on Itanium® Linux Systems

by Gerolf Hoflehner

2026

This paper discusses a repertoire of well-known and new compiler optimizations that help produce excellent server application performance and investigates their performance contributions. These optimizations combined produce a 40%... more

Quantitative evaluation of the register stack engine and optimizations for future Itanium processors

by Gerolf Hoflehner

2026, Proceedings Sixth Annual Workshop on Interaction between Compilers and Computer Architectures

This paper examines the efficiency of the register stack engine (RSE) in the canonical Itanium architecture, and introduces novel optimization techniques to enhance the RSE performance. To minimize spills and fills of the physical... more

Compiler Optimizations for Transaction Processing Workloads on Itanium® Linux Systems

by Gerolf Hoflehner

2026, 37th International Symposium on Microarchitecture (MICRO-37'04)

NAPA C: compiling for a hybrid RISC/FPGA architecture

by Maya Gokhale

2026, Proceedings. IEEE Symposium on FPGAs for Custom Computing Machines (Cat. No.98TB100251)

Hybrid architectures combining conventional processors with configurable logic resources enable efficient coordination of control with datapath computation. With integration of the two components on a single device, loop control and... more

Advanced Optimizations in Modern Compilers: JIT, AOT and Hybrid Pipelines

by Jonathan Monkila

2026, Advanced Optimizations in Modern Compilers: JIT, AOT and Hybrid Pipelines

Modern software demands high performance, portability, and adaptability, driving innovations in compiler technologies. This article investigates advanced optimization strategies in modern compilers, focusing on Just-In-Time (JIT),... more

Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems - Report on the Workshop ICOOOLPS'2007 at ECOOP'07

by Philippe Mulet

2026, arXiv (Cornell University)

Is continuation-passing useful for data flow analysis?

by Matthias Felleisen

2025, ACM SIGPLAN Notices

The widespread use of the continuation-passing style (CPS) transformation in compilers, optimizers, abstract interpreters, and partial evaluators reflects a common belief that the transformation has a positive effect on the analysis of... more

Quantum Software Engineering: Algorithm Design, Error Mitigation, and Compiler Optimization for Fault-Tolerant Quantum Computing

by Editor IJCATR

2025, International Journal of Computer Applications Technology and Research

Quantum computing is poised to revolutionize computational paradigms by leveraging quantum mechanics principles such as superposition and entanglement. However, the full-scale deployment of quantum applications remains constrained by hardware limitations, including high error rates and quantum decoherence. Quantum Software Engineering (QSE) emerges as a critical field addressing these challenges by optimizing algorithm design, error mitigation, and compiler strategies to enhance fault tolerance. Algorithm design in QSE focuses on developing quantum algorithms that efficiently exploit quantum parallelism while minimizing resource overhead. Key advancements include quantum variational algorithms, hybrid quantum-classical frameworks, and novel quantum heuristics tailored for optimization and cryptographic problems. Error mitigation techniques play a pivotal role in extending quantum circuit reliability without requiring full quantum error correction. Methods such as zero-noise extrapolation, probabilistic error cancellation, and quantum embedding techniques help reduce computational inaccuracies. Additionally, compiler optimization ensures efficient quantum program execution by minimizing gate depth, optimizing qubit mapping, and leveraging noise-adaptive scheduling to enhance quantum hardware performance. This paper explores the synergy between these three pillars of QSE, analyzing their impact on improving the feasibility of fault-tolerant quantum computing. It also examines emerging trends, including AI-driven quantum compilers, adaptive error mitigation techniques, and hardware-aware quantum software development. By bridging the gap between theoretical advancements and practical implementations, QSE provides a structured approach to accelerating quantum computing adoption across domains such as cryptography, materials science, and artificial intelligence. The findings underscore the necessity of interdisciplinary collaboration in developing robust quantum software solutions that maximize computational efficiency while mitigating inherent quantum hardware limitations.

Influence of procedure cloning on WCET prediction

by P. Marwedel

2025

For the worst-case execution time (WCET) analysis, especially loops are an inherent source of unpredictability and loss of precision. This is caused by the difficulty to obtain safe and tight information on the number of iterations... more

Automatic non-functional testing and tuning of configurable generators

by Mohamed Boussaa

2025

This thesis would not have been completed without the help of others. I would like to take this opportunity to express my gratitude towards them and acknowledge them. First of all, I would like to offer my deepest gratitude to my... more

The Design of Very Fast Portable Compilers

by Iulia Sa

2025

The Amsterdam Compiler Kit is a widely used compiler building system. Up until now, the emphasis has been on producing good object code. In this paper we describe recent work that has focused on reducing compile time. The techniques... more

Implementation of general formal translators (2)

by Iosif Iulian Petrila

2025

The general translator formalism and computing specific implementations are proposed. The implementation of specific elements necessary to process the source and destination information within the translators are presented. Some common... more

Implementation of general formal translators

by Iosif Iulian Petrila

2025

Using Polyvariant Union-Free Flow Analysis to Compile a Higher-Order Functional-Programming Language with a First-Class Derivative Operator to Efficient Fortran-like Code

by Barak Pearlmutter

2025

We exhibit an aggressive optimizing compiler for a functional-programming language which includes a first-class forward automatic differentiation (AD) operator. The compiler's performance is competitive with FORTRAN-based systems on our... more

Using polyvariant union-free flow analysis to compile a higher-order functional-programming language with a first-class derivative operator to efficient Fortran-like code

by Barak Pearlmutter

2025

Cassyopia: compiler assisted system optimization

by Matti Hiltunen

2025, Workshop on Hot Topics in Operating Systems

Execution of a program almost always involves multiple address spaces, possibly across separate machines. Here, an approach to reducing such costs using compiler optimization techniques is presented. This paper elaborates on the overall... more

Greg Chaitin, Computer Programmer

by Virginia Chaitin and

2025

A colleague describes working with Greg at IBM Research.

A Software Pipelining Framework for Simple Processor Cores

by Dee Lee

2025

Current trends in many-core architectures show a switch from a small number of architecturally sophisticated cores (e.g. Intel Core2, IBM PowerPC) to many simple cores (e.g. SiCortex and Tilera multiprocessor). These simple cores lack many... more

How to be correct, lazy and efficient ?

by Catherine Recanati

2025

This paper is an introduction to Lambdix, a lazy Lisp interpreter implemented at the Research Laboratory of Paris XI University (Laboratoire de Recherche en Informatique, Orsay). Lambdix was devised in the course of an investigation into... more
