A Physics-Informed Meta-Learning Framework
for the Continuous Solution of Parametric PDEs
on Arbitrary Geometries
Abstract
In this work, we introduce implicit Finite Operator Learning (iFOL) for the continuous and parametric solution of partial differential equations (PDEs) on arbitrary geometries. We propose a physics-informed encoder-decoder network to establish the mapping between continuous parameter and solution spaces. The decoder constructs the parametric solution field by leveraging an implicit neural field network conditioned on a latent or feature code. Instance-specific codes are derived through a PDE encoding process based on a second-order meta-learning technique. In training and inference, a physics-informed loss function is minimized during the PDE encoding and decoding. iFOL expresses the loss function in an energy or weighted residual form and evaluates it using discrete residuals derived from standard numerical PDE methods. This approach results in the backpropagation of discrete residuals during both training and inference.
iFOL features several key properties: (1) its unique loss formulation eliminates the need for the conventional encode-process-decode pipeline previously used in operator learning with conditional neural fields for PDEs; (2) it not only provides accurate parametric and continuous fields but also delivers solution-to-parameter gradients without requiring additional loss terms or sensitivity analysis; (3) it can effectively capture sharp discontinuities in the solution; and (4) it removes constraints on the geometry and mesh, making it applicable to arbitrary geometries and spatial sampling (zero-shot super-resolution capability). We critically assess these features and analyze the network’s ability to generalize to unseen samples across both stationary and transient PDEs. The overall performance of the proposed method is promising, demonstrating its applicability to a range of challenging problems in computational mechanics.
Keywords: Parameterized PDEs · Physics-informed operator learning · Conditional neural field · PDE encoding · Second-order meta-learning
1 Introduction
Numerical Solvers. Numerical methods for solving nonlinear PDEs have been central to scientific computing, enabling the simulation of complex physical phenomena. Classical approaches such as the Finite Element Method (FEM), Finite Difference Method (FDM), and Spectral Methods are widely used due to their robustness and versatility. FEM excels in handling complex geometries and boundary conditions, while FDM is known for its simplicity and efficiency on structured grids. Spectral methods, on the other hand, offer high accuracy for problems with smooth solutions by leveraging global basis functions. Despite their success, these methods often require significant computational resources, especially for high-dimensional problems, as they need to be repeated for any new set of input parameters.
Physics-Informed Neural Networks. Physics-Informed Neural Networks (PINNs) have emerged as a promising alternative, leveraging neural networks to approximate solutions to PDEs by embedding the governing equations directly into the loss function Raissi et al. (2019). PINNs are particularly attractive for solving problems with sparse or incomplete data. In specific cases where all governing equations and boundary conditions are known, PINNs can be employed as forward solvers to compute the solutions. Here, the term solver refers to finding solutions that satisfy the governing equations, translating the problem into a constrained optimization task. However, this approach faces two significant challenges: (1) the training time for PINNs is still not competitive with traditional numerical solvers, except in very high-dimensional settings where classical methods struggle Rezaei et al. (2022); Grossmann et al. (2024), and (2) even at their best (considering recent advances), standard PINNs perform only comparably to methods like FEM, and the solutions lack generalizability to other parametric inputs. Consequently, the core limitations of traditional numerical methods, such as computational inefficiency and lack of flexibility for parametric variations, persist when using PINNs solely for forward problems. The performance of PINNs in inverse problems, model discovery, and calibration differs from their application in forward problems, with several reported advantages in these contexts Faroughi et al. (2024).
As a current trend, researchers are also combining ideas from FEM and neural networks to develop new deep learning-based solvers; see, for example, Mitusch et al. (2021); Škardová et al. (2024); Xiong et al. (2025); Li et al. (2024). Although these approaches offer appealing features, particularly in terms of implementation simplicity, their computational time remains comparable to that of the FEM, and their applicability to highly nonlinear multiphysics problems has yet to be demonstrated.
Neural Operators. Neural Operators (NOs) extend the capabilities of traditional machine learning by learning mappings between infinite-dimensional function spaces, making them powerful tools for solving parametric PDEs and modeling complex physical systems. NOs should ideally generalize across entire families of functions, enabling them to predict solutions for varying input conditions, such as material properties or boundary conditions. Rather well-known approaches for operator learning so far are the Deep Operator Network (DeepONet) and its extensions Lu et al. (2021); Abueidda et al. (2025); He et al. (2024); Kumar et al. (2025); Yu et al. (2024), physics-informed DeepONet Wang et al. (2021); Mandl et al. (2025); Li et al. (2025), the Fourier Neural Operator (FNO) and its extensions Li et al. (2021); Azizzadenesheli et al. (2024); Li et al. (2023a, b), U-Net Ronneberger et al. (2015); Chen et al. (2019); Mendizabal et al. (2020); Mianroodi et al. (2022); Gupta et al. (2023); Najafi Koopas et al. (2025), and others.
The mentioned methods are also continuously improving in their generalizability to complex geometries and their ability to handle complex solution fields. For example, Yin et al. (2024); Li et al. (2023a); Xiao et al. (2024); Zeng et al. (2025); Chen et al. (2024a) introduce various frameworks designed to efficiently learn geometry-dependent solution operators for different PDEs. Similarly, the combination of physics-informed NOs and classical numerical solvers such as FEM, FDM, and FFT is also receiving more attention; see, for example, Yamazaki et al. (2025a), Rezaei et al. (2025a), Xu et al. (2024), Eshaghi et al. (2025), Lee et al. (2025), Kaewnuratchadasorn et al. (2024), Franco et al. (2023), and Harandi et al. (2025). The latter helps avoid time-consuming data generation or reduces the required data, and delivers higher accuracy for unseen predictions.
Operator learning faces several key challenges that hinder its broader adoption in scientific computing.

- A major issue is generalization, as models often struggle to extrapolate beyond the training data, especially in highly nonlinear systems with complex geometries.
- Capturing high-frequency features (e.g., discontinuous solution fields) remains challenging. In many industrial applications, discontinuities are expected in the solution fields; these produce high gradients and are therefore highly relevant for design purposes (e.g., stress or flux values obtained by differentiating a primary field such as displacement or temperature).
- Data efficiency is a concern, as training requires large datasets from expensive simulations or experimental measurements. On that note, many operator-learning frameworks also lack physical constraints, leading to solutions that may violate fundamental laws.
- Scalability to large 3D problems is computationally expensive, and meta-learning techniques for adaptability are still underexplored. This also includes extending their applicability to irregular geometries or domains with complex topologies.
Implicit Neural Representations. Implicit Neural Representations (INRs), also known as neural fields, represent a novel approach for encoding data as continuous functions parameterized by neural networks. Unlike traditional data storage methods, such as grids, meshes, or point clouds, INRs implicitly map input coordinates (e.g., spatial or temporal points) to corresponding output values (e.g., deformation, temperature, or color). This method enables efficient and flexible modeling of high-dimensional and complex data. Consequently, the application of INRs in scientific computing has been receiving increasing attention.
The literature and research on neural fields for scientific machine learning, particularly in computational mechanics, have gained momentum in recent years. Serrano et al. (2023) introduced a novel method that uses coordinate-based networks to solve PDEs on general geometries, removing input mesh constraints and excelling in diverse problem domains such as spatio-temporal forecasting and geometric design. Naour et al. (2024) proposed a continuous-time modeling approach for time series forecasting, leveraging implicit neural representations and meta-learning. Boudec et al. (2024) introduced a physics-informed iterative solver that learns to condition gradient descent for solving parametric PDEs, improving optimization stability and accelerating convergence. See also contributions by Yeom et al. (2025) for achieving speed-up in training neural fields via weight scaling of sinusoidal neural fields. Hagnberger et al. (2024) proposed vectorized conditional neural fields to solve time-dependent PDEs, which offer efficient inference, zero-shot super-resolution, and generalization to unseen PDE parameters through attention-based neural fields. Du et al. (2024a) discussed the conditional neural field latent diffusion model, a generative deep learning framework that enables efficient and probabilistic simulation of complex spatiotemporal turbulence in 3D irregular domains, leveraging conditional neural fields and latent diffusion. Dupont et al. (2022a) proposed a neural compression framework that handles diverse data modalities by converting data into implicit neural representations. Their approach reduces encoding time by two orders of magnitude. Catalani et al. (2024) developed a methodology using INRs to learn surrogate models for steady-state fluid dynamics on unstructured domains, handling 3D geometric variations and generalizing to unseen shapes.
Despite these strengths, INRs face several challenges. Training INRs can be computationally expensive, requiring large amounts of data and careful regularization to prevent overfitting. Generalization remains a significant challenge, although meta-learning and transfer-learning approaches are emerging to address this limitation. Additionally, while INRs excel in representing continuous fields, their scalability to large-scale or high-dimensional data remains constrained by the capacity of the neural network and the computational cost of training. Representing disconnected or complex topologies without artifacts can also be challenging. Addressing these challenges will be critical for further advancing the utility and adoption of INRs in real-world applications.
Summary, open questions and our contributions. Traditional numerical solvers are highly optimized for solving challenging PDE problems. However, they are typically designed for one-time use, requiring costly recomputation whenever input parameters change. Deep learning methods, particularly neural operators, offer a promising alternative but are often highly data-demanding. Integrating physical constraints into neural operators has shown potential in mitigating these challenges, yet training such complex networks remains difficult, especially across a wide range of input parameters. Furthermore, the application of neural operators to real-world problems is still underdeveloped, with key challenges including scalability to complex geometries, accuracy in capturing sharp gradients, and efficient training, all of which remain active areas of research.
Conditional neural fields represent a promising yet straightforward network architecture that compactly and continuously maps function spaces across arbitrary domains. They offer significant flexibility for operator learning in parametrized PDEs, as their outputs can be sampled at arbitrary resolutions and modified by adjusting a set of latent variables. In contrast to discrete networks, where memory usage increases inefficiently with spatio-temporal resolution, neural fields require memory that scales primarily with the number of parameters in the network. Additionally, specialized architectures, such as sinusoidal activations or Fourier features, enhance their ability to capture high-frequency details, while integration with numerical methods extends their applicability to complex geometries and topologies. This work aims to address some of the above challenges by integrating concepts from the standard finite element method and conditional neural fields into operator learning. As a result, we introduce a physics-informed encoder-decoder network to establish the mapping between continuous parameter and solution spaces for parametrized PDEs. To the best of our knowledge, this is the first application of a single-network conditional neural field for operator learning, owing this capability to its unique loss formulation, which eliminates the need for the complex encode-process-decode pipeline (Serrano et al., 2023; Catalani et al., 2024). This approach is well-suited for complex geometries, offering data efficiency, requiring no labeled data, and demonstrating scalability to real-world industry applications, especially in the fields of materials engineering and science.
In Section 2, we summarize the main problem formulation and the types of PDEs and geometries chosen for this study. Next, in Section 3, we introduce implicit Finite Operator Learning as a promising approach for near-real-world problems. In Section 4, we present the study results, followed by conclusions and future directions in Section 5.
2 Parameterized PDEs
Let $u(\boldsymbol{x}, t; \boldsymbol{\mu})$ be the solution field representing physical quantities such as temperature, displacement, velocity, or pressure. The solution field depends on the time $t$, the spatial coordinates $\boldsymbol{x} \in \Omega$, and the control parameter $\boldsymbol{\mu}$. The solution field is parameterized by $\boldsymbol{\mu}$ while satisfying the following PDE:

$$\frac{\partial u}{\partial t} + \mathcal{N}(u; \boldsymbol{\mu}) = 0 \;\; \text{in } \Omega \times (0, T], \qquad u(\boldsymbol{x}, 0; \boldsymbol{\mu}) = u_0(\boldsymbol{x}; \boldsymbol{\mu}) \;\; \text{in } \Omega, \tag{1}$$

supplemented by appropriate boundary conditions on $\partial\Omega$. Here, $u_0$ denotes the initial condition, and the operator $\mathcal{N}$ can be either linear or nonlinear with respect to both the solution and the control parameter; it typically involves partial derivatives of $u$ with respect to $\boldsymbol{x}$.
The control parameter $\boldsymbol{\mu}$ can influence the solution field in various ways, such as defining the initial and boundary conditions, material properties, or other system characteristics. In more complex scenarios, $\boldsymbol{\mu}$ may represent a spatially distributed field, accounting for variations in geometry or domain (e.g., material) heterogeneity.
Standard numerical methods, such as the finite element and finite volume methods, are widely used for solving Eq. 1. In practical applications and design processes, they are repeatedly executed for each variation in the control parameter $\boldsymbol{\mu}$. These methods yield a numerical approximation of $u$ for varying $\boldsymbol{\mu}$.
After applying a spatial discretization, such as the finite element method, we approximate the solution field as:

$$u(\boldsymbol{x}, t; \boldsymbol{\mu}) \approx \sum_{i=1}^{N} N_i(\boldsymbol{x}) \, \hat{u}_i(t; \boldsymbol{\mu}), \tag{2}$$

where $N_i(\boldsymbol{x})$ are spatial basis functions, $\hat{u}_i(t; \boldsymbol{\mu})$ are the time-dependent, parameterized coefficients representing the discrete solution at spatial degrees of freedom, and $N$ is the number of grid points. Substituting this approximation into the governing PDE and applying the Galerkin projection, we obtain the semi-discrete system:

$$\boldsymbol{R}(\hat{\boldsymbol{u}}; \boldsymbol{\mu}) = \boldsymbol{M} \, \dot{\hat{\boldsymbol{u}}} + \boldsymbol{K}(\hat{\boldsymbol{u}}, \boldsymbol{\mu}) \, \hat{\boldsymbol{u}} - \boldsymbol{F} = \boldsymbol{0}, \tag{3}$$
where:

- $\hat{\boldsymbol{u}}$ and $\boldsymbol{R}$ are the vectors of unknowns (discrete solutions at spatial nodes) and nodal residuals, respectively,
- $\boldsymbol{M}$ is the mass matrix, given by
$$M_{ij} = \int_{\Omega} N_i(\boldsymbol{x}) \, N_j(\boldsymbol{x}) \, \mathrm{d}\Omega, \tag{4}$$
- $\boldsymbol{K}(\hat{\boldsymbol{u}}, \boldsymbol{\mu})$ represents the nonlinear stiffness, often resulting from the spatial derivatives of $u$. It may depend on both the solution and the problem parameters,
- $\boldsymbol{F}$ is the discrete source term,
- $\boldsymbol{\mu}$ is the control vector that parametrizes the discrete system by governing any or a combination of initial and boundary conditions, material properties, domain geometry, and spatial heterogeneity.
This semi-discrete system is an ordinary differential equation (ODE) system in time, which must be further discretized using a time-stepping method. In this work, the implicit Euler method is used to approximate the time-dependent term in Eq. 3 with a chosen time step size $\Delta t$. Table 1 outlines the parametrized boundary value problems explored in this research for operator learning. These problems are selected from a wide range of applications, particularly in computational mechanics. The upper part of the table focuses on stationary problems, where the given PDE has no transient term or time evolution, while the lower part highlights two well-known nonlinear transient equations. For clarity, the FEM-based residual vector for each problem is specified in Table 3.
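To make the time discretization concrete, the following minimal sketch (in JAX, which is also used for our implementation) assembles the implicit Euler residual of Eq. 3 for one step; `mass_matrix`, `stiffness`, and `source` are hypothetical problem-specific callables, not part of any released code.

```python
import jax.numpy as jnp

def implicit_euler_residual(u_new, u_old, mu, dt, mass_matrix, stiffness, source):
    """Residual of Eq. 3 after implicit Euler discretization:
    r(u_{n+1}) = M (u_{n+1} - u_n) / dt + K(u_{n+1}, mu) u_{n+1} - F(mu)."""
    M = mass_matrix()          # (n_dof, n_dof) mass matrix, Eq. 4
    K = stiffness(u_new, mu)   # nonlinear stiffness: solution- and parameter-dependent
    F = source(mu)             # discrete source term
    return M @ (u_new - u_old) / dt + K @ u_new - F
```

Driving this residual to zero at each step (with Newton iterations in a classical solver, or through the loss of Section 3.3 in iFOL) advances the solution in time.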
| 2D/3D steady-state problems | Mechanical problem: hyper-/linear elasticity | Nonlinear diffusion |
|---|---|---|
| PDE | $\nabla \cdot \boldsymbol{P} = \boldsymbol{0}$ (hyperelastic), $\nabla \cdot \boldsymbol{\sigma} = \boldsymbol{0}$ (linear) | $\nabla \cdot \left( k(\boldsymbol{x}, T) \, \nabla T \right) = 0$ |
| Operator: In $\to$ Out | $E(\boldsymbol{x}) \mapsto \boldsymbol{u}(\boldsymbol{x})$, $\bar{\boldsymbol{u}} \mapsto \boldsymbol{u}(\boldsymbol{x})$ | $k(\boldsymbol{x}) \mapsto T(\boldsymbol{x})$ |
| Geometry and BCs | (figure) | (figure) |
| Input samples | (figure) | (figure) |

| 2D transient problems | Nonlinear thermal diffusion | Allen-Cahn equation |
|---|---|---|
| PDE | $\rho c \, \partial T / \partial t = \nabla \cdot \left( k(\boldsymbol{x}, T) \, \nabla T \right) + Q$ | $\partial \phi / \partial t = -M \, \delta \mathcal{F} / \delta \phi$ |
| Operator: In $\to$ Out | $T_n \mapsto T_{n+1}$ | $\phi_n \mapsto \phi_{n+1}$ |
| Geometry and BCs | (figure) | (figure) |
| Input samples | (figure) | (figure) |
3 iFOL: implicit Finite Operator Learning
3.1 Implicit Neural Representations
Implicit Neural Representations (INRs) are coordinate-based multi-layer perceptron (MLP) networks parameterized by layers of weights $\boldsymbol{W}_i$, biases $\boldsymbol{b}_i$, and nonlinear activation functions, with $i = 1, \dots, L$. These networks model spatial fields as an implicit function that maps spatial coordinates $\boldsymbol{x}$ to scalar or vector quantities. We adopt SIREN (Sitzmann et al., 2020) as the core INR architecture in our framework. This network utilizes sine activation functions, combined with a distinct initialization strategy:

$$\boldsymbol{h}_{i+1} = \sin\left(\omega_0 \left(\boldsymbol{W}_i \boldsymbol{h}_i + \boldsymbol{b}_i\right)\right), \quad i = 0, \dots, L-1, \qquad \boldsymbol{h}_0 = \boldsymbol{x}. \tag{5}$$

Here, $\boldsymbol{h}_1, \dots, \boldsymbol{h}_L$ represent the hidden activations at each layer of the network. The parameter $\omega_0$ is a hyperparameter that governs the frequency bandwidth of the network. SIREN requires a specialized initialization of weights to ensure that outputs across layers adhere to a standard normal distribution.
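For illustration, a minimal sketch of such a SIREN in JAX, including the specialized initialization of Sitzmann et al. (2020), might look as follows; the layer sizes are placeholders.

```python
import jax
import jax.numpy as jnp

def init_siren_params(key, sizes, omega0=30.0):
    # First layer: U(-1/n_in, 1/n_in); hidden layers:
    # U(-sqrt(6/n_in)/omega0, sqrt(6/n_in)/omega0), per Sitzmann et al. (2020).
    params = []
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        key, sub = jax.random.split(key)
        bound = 1.0 / n_in if i == 0 else jnp.sqrt(6.0 / n_in) / omega0
        W = jax.random.uniform(sub, (n_in, n_out), minval=-bound, maxval=bound)
        params.append((W, jnp.zeros(n_out)))
    return params

def siren(params, x, omega0=30.0):
    h = x
    for W, b in params[:-1]:
        h = jnp.sin(omega0 * (h @ W + b))  # sine activations, Eq. 5
    W, b = params[-1]
    return h @ W + b                        # linear output layer
```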
3.2 Conditional Neural Fields
Our goal is to obtain a plausible solution to a parameterized partial differential equation using neural fields. This is achieved by conditioning the neural field on a set of latent variables $\boldsymbol{z}$, which can encode the solution field across arbitrary parameterizations and discretizations of the underlying PDEs. By varying these latent variables, we can effectively modulate the neural solution field. $\boldsymbol{z}$ is typically a low-dimensional vector and is also referred to as a feature code. For a comprehensive discussion and review of conditional neural fields, interested readers are referred to Xie et al. (2022).
iFOL utilizes Feature-wise Linear Modulation (FiLM; Perez et al. (2018)), which conditions the neural field in an auto-decoding manner. It employs a simple neural network without hidden layers (i.e., a linear transformation) to predict a shift vector $\boldsymbol{\phi}_i = \boldsymbol{V}_i \boldsymbol{z}$ from the latent variables $\boldsymbol{z}$ for each layer of the neural field network. This yields the shift-modulated SIREN:

$$u_{\boldsymbol{\theta}}(\boldsymbol{x}, \boldsymbol{z}): \quad \boldsymbol{h}_{i+1} = \sin\left(\omega_0 \left(\boldsymbol{W}_i \boldsymbol{h}_i + \boldsymbol{b}_i + \boldsymbol{V}_i \boldsymbol{z}\right)\right), \qquad \boldsymbol{h}_0 = \boldsymbol{x}, \tag{6}$$

where $u_{\boldsymbol{\theta}}$ is the neural solution field, and $\{\boldsymbol{W}_i, \boldsymbol{b}_i\}$ and $\{\boldsymbol{V}_i\}$ represent the trainable parameters of the SIREN network, referred to as the Synthesizer, and of the FiLM hypernetworks, referred to as the Modulator, respectively.
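A minimal sketch of this shift modulation, reusing `siren`-style parameters from the snippet above, might read as follows; the per-layer matrices `V_i` are the assumed weights of the linear hypernetwork.

```python
import jax.numpy as jnp

def modulator(z, V):
    # Linear hypernetwork without hidden layers: one shift vector per hidden layer.
    return [z @ V_i for V_i in V]

def modulated_siren(params, shifts, x, omega0=30.0):
    # Shift-modulated SIREN of Eq. 6: the latent code z enters every hidden
    # layer through an additive shift phi_i = V_i z.
    h = x
    for (W, b), phi in zip(params[:-1], shifts):
        h = jnp.sin(omega0 * (h @ W + b + phi))
    W, b = params[-1]
    return h @ W + b
```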
3.3 PDE Loss Function
Building on physics-informed neural networks, we introduce a domain-integrated physical loss function whose variation with respect to the solution field yields the residuals of the governing partial differential equations. Among the functionals commonly used in computational mechanics, the total potential energy functional and the weighted residual functional naturally fulfill this property. Since the energy functional is physics-specific, not always explicitly known, and its derivation can be challenging, we adopt the well-established weighted residual functional as follows:
$$\mathcal{L}(\boldsymbol{\theta}; \boldsymbol{\mu}) = \int_{\Omega} u_{\boldsymbol{\theta}} \, r(u_{\boldsymbol{\theta}}; \boldsymbol{\mu}) \, \mathrm{d}\Omega. \tag{7}$$

This formulation weights the residual $r$ using the neural field $u_{\boldsymbol{\theta}}$ as a test function, enforcing its vanishing in an integral sense. Applying the chain rule and variational calculus, the gradient of the loss function with respect to the predicted solution is given by:

$$\frac{\partial \mathcal{L}}{\partial u_{\boldsymbol{\theta}}} = r(u_{\boldsymbol{\theta}}; \boldsymbol{\mu}) + u_{\boldsymbol{\theta}} \, \frac{\partial r}{\partial u_{\boldsymbol{\theta}}}, \tag{8}$$

where the second term, $u_{\boldsymbol{\theta}} \, \partial r / \partial u_{\boldsymbol{\theta}}$, is zero due to the stationarity of the residual functional with respect to the solution field. To compute the loss function efficiently, we discretize the domain and employ the corresponding discrete residuals (Eq. 3) as follows:

$$\mathcal{L}(\boldsymbol{\theta}; \boldsymbol{\mu}) \approx \sum_{e=1}^{n_{el}} \left(\boldsymbol{u}_{\boldsymbol{\theta}}^{e}\right)^{\mathsf{T}} \boldsymbol{r}^{e}\left(\boldsymbol{u}_{\boldsymbol{\theta}}^{e}; \boldsymbol{\mu}^{e}\right), \tag{9}$$

where $\boldsymbol{u}_{\boldsymbol{\theta}}$ represents the neural solution field vector evaluated at the mesh points, and the superscript $e$ indicates that the quantity is evaluated at the element level. Note that, although the PDE loss presented here is based on the weighted residual, one can also directly utilize the energy functional of a PDE, as shown in Table 3 in Appendix A.4. In fact, the energy functional-based loss in Table 3 was employed for the transient problems.
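In implementation terms, the discrete loss of Eq. 9 reduces to a gather-and-contract over the mesh. The sketch below assumes a hypothetical per-element physics kernel `element_residual` that returns $\boldsymbol{r}^e$ (cf. Table 3); it is a minimal illustration, not the released implementation.

```python
import jax
import jax.numpy as jnp

def pde_loss(u_nodes, mu_nodes, connectivity, element_residual):
    """Eq. 9: sum over elements of (u^e)^T r^e(u^e; mu^e).
    connectivity: (n_el, nodes_per_el) integer array of the mesh."""
    u_e = u_nodes[connectivity]    # gather nodal field values per element
    mu_e = mu_nodes[connectivity]  # gather nodal parameters per element
    r_e = jax.vmap(element_residual)(u_e, mu_e)
    return jnp.sum(jnp.sum(u_e * r_e, axis=-1))
```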
This strategy enables the seamless integration of well-established numerical methods into the learning process while minimizing computational overhead. Consequently, it allows for the construction of PINNs without relying on resource-intensive automatic differentiation to evaluate the components of the PDE.
3.4 Training
The training and inference of the conditional neural fields (CNFs) involve the computation of the latent variables $\boldsymbol{z}$, a step commonly referred to as encoding. In data-driven operator learning methodologies utilizing CNFs, encoding is performed on both the input and output fields, following the so-called Encode-Process-Decode framework (Dupont et al., 2022b; Yin et al., 2022; Serrano et al., 2023; Catalani et al., 2024). In contrast, this work uniquely encodes the PDE and the underlying physics, rather than the spatial fields. In each training step of iFOL, the latent codes for the sample batch are first derived by minimizing the physical loss with respect to the latent codes in just a few steps of gradient descent:

$$\boldsymbol{z}_j^{(k+1)} = \boldsymbol{z}_j^{(k)} - \alpha \, \nabla_{\boldsymbol{z}} \mathcal{L}\left(\boldsymbol{\theta}, \boldsymbol{z}_j^{(k)}; \boldsymbol{\mu}_j\right), \qquad \boldsymbol{z}_j^{(0)} = \boldsymbol{0}. \tag{10}$$
Subsequently, the parameters of the Synthesizer and Modulator networks are optimized using the computed latent codes:

$$\boldsymbol{\theta} \leftarrow \boldsymbol{\theta} - \beta \, \nabla_{\boldsymbol{\theta}} \sum_{j} \mathcal{L}\left(\boldsymbol{\theta}, \boldsymbol{z}_j; \boldsymbol{\mu}_j\right). \tag{11}$$
Basically, we partition the model into context-specific parameters, which dynamically adapt to individual samples, and meta-trained parameters, which are globally optimized to enable knowledge transfer across diverse contexts. To the best of our knowledge, this work presents the first application of second-order meta-learning, inspired by the CAVIA algorithm (Zintgraf et al., 2019), for the parametric learning of PDEs in a physics-informed manner. The details of the approach are provided in Algorithms 1 and 2. Here, $\alpha$ denotes the encoding learning rate, while $\beta$ is the training learning rate, which adjusts the weights of the modulator and synthesizer networks.
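In code, the two nested optimizations of Eqs. 10 and 11 can be sketched as follows; `loss_fn` is the physical loss of Eq. 9, and plain gradient descent stands in for the Adam update used in our experiments.

```python
import jax
import jax.numpy as jnp

def encode(theta, mu, loss_fn, latent_dim=512, n_steps=3, alpha=1e-2):
    """Inner loop (Eq. 10): adapt a fresh latent code by a few gradient steps."""
    z = jnp.zeros(latent_dim)  # context code initialized per sample, as in CAVIA
    for _ in range(n_steps):
        z = z - alpha * jax.grad(loss_fn, argnums=1)(theta, z, mu)
    return z

def outer_loss(theta, batch_mu, loss_fn):
    """Outer objective (Eq. 11): the physical loss at the adapted codes.
    Differentiating through the inner loop yields the second-order meta-gradient."""
    zs = jax.vmap(lambda mu: encode(theta, mu, loss_fn))(batch_mu)
    return jnp.mean(jax.vmap(lambda z, mu: loss_fn(theta, z, mu))(zs, batch_mu))

# One meta-training step (training learning rate beta):
# grads = jax.grad(outer_loss)(theta, batch_mu, loss_fn)
# theta = jax.tree_util.tree_map(lambda p, g: p - beta * g, theta, grads)
```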
To provide further clarity on the architecture presented above, a comparative analysis between the proposed approach and two well-established operator learning algorithms is shown in Fig. 1. Furthermore, a more detailed description of each component of the iFOL framework is provided in Fig. 2. The upper section of the figure outlines the approach for stationary problems, while the lower section demonstrates the application of the same architecture to transient problems. Notably, once trained on transient problems, the network can be repeatedly invoked to generate solutions over time, without the need for additional retraining.


4 Results
4.1 Stationary problems: hyper-elastic mechanical equilibrium
We begin by evaluating the performance of iFOL in predicting the mechanical deformation of a hyperelastic heterogeneous solid when the property distribution varies, i.e., the operator maps the spatial property distribution to the deformation field. The strong form and energy formulation of the solid in this context follow a nonlinear form, which is detailed in Tables 1 and 3 and Section A.1, along with the selected material properties.

The random training samples are generated using a Fourier-based parametrization (see Rezaei et al. (2025b) for details on this parameterization scheme and its coefficients). The chosen network parameters are listed in Table 4.
In Fig. 3, the results of iFOL are shown for four unseen test cases, all evaluated at more than twice the training resolution (i.e., approximately six times more grid points). The first row presents predictions for unseen higher-frequency components as well as low-frequency ones with unsymmetrical property distribution. In the second row, we further challenge the network by testing its performance on polycrystalline microstructures, which go well beyond the training samples used in this study. Notably, in the last case featuring a multiphase polycrystal, we even alter the material property values (e.g., Young’s modulus) to include ranges not encountered during training.
Despite significant changes in the input space and resolution, the network produces reasonable predictions. Across all test cases, the errors remain more than an order of magnitude below the applied displacement and the maximum deformation in the solid.
At this point, a valid question is how to ensure the quantity and quality of the initial training samples needed to perform a certain task. Although a definitive answer to this question requires much more intensive study, as it is influenced by many parameters, we attempt to address it here by altering the number of training samples while keeping the network hyperparameters the same. As expected and shown in Fig. 4, increasing the number of samples systematically reduces the errors up to a certain level. Interestingly, the entire framework appears to be very sample- or data-efficient, as extensive samples are not required to achieve reasonable results. The same pattern repeats in other case studies, but we omit them for the sake of brevity.

4.2 Stationary problems: elasticity and learning on boundary conditions
In this study, we examine the performance of iFOL in learning the solution for given BCs while keeping material properties and geometry fixed. More details on the chosen boundary values are provided in Section A.1, while details on the formulation of the loss term can be found in Tables 1 and 3. The selected network hyperparameters are listed in Table 4. Here, the operator maps the applied Dirichlet BCs to the deformation field, i.e., $\bar{\boldsymbol{u}} \mapsto \boldsymbol{u}(\boldsymbol{x})$. We focus on gyroid surfaces, a specific class of metamaterials. The complex topology of the gyroid enables novel functionalities, which highlights its importance in the design of next-generation materials. Moreover, we consider applications in multiscale analysis, where the applied boundary conditions typically originate from the macroscale, and it is crucial to determine the microstructural response to the imposed macroscopic displacement field.
For this purpose, we generate 1000 random samples for the applied displacement vector on the front surface of the chosen metamaterial and train iFOL to learn the full deformation field for a given Dirichlet BC. In Fig. 5, the results of iFOL versus FEM are shown for three different test cases. For visualization purposes, the deformations are magnified by a factor of 10. The maximum pointwise errors for unseen test cases are one order of magnitude lower than the applied displacement field and are mainly concentrated in specific regions where Neumann BCs are applied. It is worth noting that, within the iFOL framework, Dirichlet BCs are enforced in a hard manner, which explains the zero error at the front and back surfaces.

4.3 Stationary problems: Nonlinear diffusion
In this study, we not only critically assess the applicability of iFOL in approximating nonlinear operators on complex geometries with irregular meshes but also evaluate its effectiveness for sensitivity analysis. We are specifically interested in the derivatives of domain-integrated functionals that explicitly depend on the solution field (the operator's output) with respect to the parameter field (the operator's input), e.g.,

$$\frac{\mathrm{d}}{\mathrm{d}\boldsymbol{\mu}} \int_{\Omega} g\left(u(\boldsymbol{\mu})\right) \mathrm{d}\Omega = \int_{\Omega} \frac{\partial g}{\partial u} \, \frac{\partial u}{\partial \boldsymbol{\mu}} \, \mathrm{d}\Omega. \tag{12}$$

Here, $\partial u / \partial \boldsymbol{\mu}$ represents the Jacobian of the operator and can be efficiently computed by applying automatic differentiation (AD) to the trained iFOL network as a postprocessing step.
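Because the trained network maps the parameter field to the solution differentiably, this post-processing step amounts to a single reverse-mode sweep. The sketch below is illustrative; `infer` (PDE encoding plus decoding, $\boldsymbol{\mu} \mapsto \boldsymbol{u}$) and `response` (the scalar response functional) are hypothetical handles.

```python
import jax

def sensitivity(mu_nodes, infer, response):
    """Eq. 12 via automatic differentiation: d(response)/d(mu), where the
    chain rule through infer supplies the operator Jacobian du/dmu."""
    return jax.grad(lambda mu: response(infer(mu), mu))(mu_nodes)
```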

For this example, 8000 samples were used for training, and 2000 samples were reserved for testing as unseen data. The conductivity field is parametrized using the Fourier parametrization introduced in (Rezaei et al., 2024). The so-called sigmoidal projection was applied to generate dual-phase conductivity fields within the range of 0.5 to 1.0. For further details, refer to the corresponding column in Table 1. A discussion and details on the network architecture are provided in Appendix B and Table 4. For a summary of the mathematical formulation, see also Section A.2.
Figure 6 showcases how iFOL performs in predicting temperature fields for unseen conductivity distributions compared to the reference finite element method. The complicated 3D geometry (spelling out "FOL") is discretized using a fully unstructured tetrahedral mesh. The input parameter exhibits a complex spatial distribution, while the temperature field is governed by a thermal diffusion PDE with high nonlinearity arising from the temperature-dependent conductivity field. iFOL and reference FE solutions are presented for two samples that were not included in the training set. We observe that the learned operator generalizes both to the parameter field and to the evaluation grid, which is 7× finer than the training mesh.
Figure 7 displays the sensitivity maps of the response functional for two unseen samples, where one map is obtained via automatic differentiation applied to iFOL's inference function, and the other is computed using adjoint-based sensitivity analysis. Qualitative comparisons show a high degree of agreement between the two maps, with both displaying similar spatial sensitivity patterns. To the authors' knowledge, this is the first time that the Jacobian of neural network-based operators for spatially distributed input and output fields has been evaluated and compared against classical sensitivity analysis techniques, such as adjoint methods. Based on the results presented here, we can conclude that iFOL is not only able to accurately approximate PDE-based operators in a mesh- and geometry-agnostic manner, but also to reliably estimate the operator's Jacobian.

4.4 Transient problems: nonlinear thermal diffusion in heterogeneous domain
As mentioned before, for transient problems we adopt a slightly different strategy, where we train the network to learn the dynamics of the transient problem from step $t_n$ to $t_{n+1}$. Upon successful training, the trained network can then be called multiple times in a loop to predict the evolution of the desired solution field. In this section, we present the results for nonlinear thermal diffusion in a heterogeneous domain, where we learn a map between $T_n$ and $T_{n+1}$. The heterogeneity is intentionally introduced to create sharp solution jumps, which typically correspond to higher frequencies in the solution and are usually not easily captured by a naive network setup. From a physical perspective, this can be related to different phases in a material microstructural system. The additional nonlinearity arises from the fact that the material conductivity is not only a function of space but also varies with temperature (i.e., the solution field). See also Section A.2 for the summary of the mathematical formulation and further details on the material properties. The results of this study are summarized in Fig. 8.
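This autoregressive use of the network can be sketched as follows; `encode` and `decode` are hypothetical handles to the PDE encoding of Eq. 10 and the conditioned neural field of Eq. 6.

```python
import jax.numpy as jnp

def rollout(theta, T0, coords, encode, decode, n_steps=50):
    """Each predicted field is fed back as the next input; no retraining and
    no labeled data are involved at inference time."""
    history, T = [T0], T0
    for _ in range(n_steps):
        z = encode(theta, T)          # latent code for the current state
        T = decode(theta, z, coords)  # solution at the next step on the query points
        history.append(T)
    return jnp.stack(history)
```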

Here, the training fields are randomly generated using a Gaussian process with varying length scales of the radial basis function kernel, and the results of the temperature profile evolution for different unseen initial conditions are provided. At this point, the resolution for both training and testing is the same. Note that any obtained temperature profile as an output serves as a new input for the next step. Therefore, for transient problems, errors can accumulate over time, which has also been reported in previous studies Du et al. (2024b); Dingreville et al. (2024); Yamazaki et al. (2025b). The network's capability to handle unseen temperature fields is also worth mentioning. For example, from the fifth time step onward, the temperature profile is primarily dominated by BCs and does not resemble the randomly generated Gaussian-process fields used to train the network. Yet, the predictions remain reasonably accurate. We emphasize that the predictions of up to 50 steps shown in this figure are purely based on the one-time-trained network, with no labeled data or hybrid solver used to further support the network.
As a final remark, we should point out our fundamentally different approach to handling transient problems, which involves breaking the time axis into sequential steps, as explained above. In the classical PINN approach, the time variable is yet another input in addition to space. Although this approach initially makes sense and appears natural, training with respect to both time and space can become very challenging for high-dimensional problems, as the network can simply violate so-called temporal causality, requiring additional enhancements that are not needed here Wang et al. (2024).
4.5 Transient problems: Allen-Cahn equation on a complex domain
In this section, we present the results for the nonlinear Allen-Cahn equation in a homogeneous irregular domain, where we establish a mapping between $\phi_n$ and $\phi_{n+1}$. See also Section A.3 for the summary of the mathematical formulation. The irregular domain is intentionally chosen to demonstrate the approach's capability to handle complex geometries. From a practical perspective, such shapes are commonly observed in active material particles on the cathode side of battery systems, where phase transitions in the material govern ion diffusion Chen et al. (2024b). As before, training samples are generated from a Gaussian process to train the network. The results in Fig. 9 illustrate two different initializations, leading to distinct final phase profiles. The same strategy as for the nonlinear thermal problem applies here, so we do not repeat every detail. Predictions remain accurate up to 10 time steps, after which errors gradually accumulate, a point that will be analyzed further. It is important to note that the resolution for both training and testing is kept the same, as it was found to be sufficient based on the length scale parameter in the phase field equation. Further discussion on super-resolution capabilities is provided in Section 4.6. We did not leverage any precomputed solution fields or hybrid solver coupling. However, such techniques could be incorporated in future developments to further reduce errors and maintain accuracy within expected margins. See investigations by Li et al. (2023c); Dingreville et al. (2024).
4.6 Studies on zero-shot super-resolution
Zero-shot super-resolution (ZSSR) in operator learning refers to the ability of a trained model to enhance the resolution of function mappings without requiring additional high-resolution training data. Unlike traditional super-resolution techniques that rely on supervised learning with paired low- and high-resolution samples, zero-shot approaches leverage the inherent structure of the learned operator to generalize across different resolutions Li et al. (2021); Koopas et al. (2024).
ZSSR is particularly useful in scientific computing, where performing calculations on very fine discretizations is computationally expensive. It is worth noting the similarities to model order reduction techniques, where the dimensionality of the problem is reduced by capturing the dominant solution modes Quarteroni and Rozza (2011).
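Since the decoder is a continuous field, ZSSR amounts to querying the once-encoded latent code at a denser point set; a minimal sketch (with `decode` again the hypothetical conditioned neural field of Eq. 6):

```python
import jax

def superresolve(theta, z, fine_coords, decode):
    """Evaluate the continuous solution field at arbitrary (e.g., finer)
    coordinates without any high-resolution retraining."""
    return jax.vmap(lambda x: decode(theta, z, x))(fine_coords)
```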
The ZSSR capability of iFOL is also demonstrated in Figures 3 and 6. Here, we further validate this claim by systematically testing the proposed iFOL approach and quantitatively assessing the error accumulation across different mesh resolutions. Our main focus is on stationary nonlinear diffusion in 3D, where we evaluate the solution on both finer and coarser grids, as well as on the 2D nonlinear Allen-Cahn equation, where we analyze error accumulation over time and for varying resolutions. Similar behavior is observed for other investigated problems, and therefore, for brevity, we do not report all details.
In Fig. 10, we evaluate the network on two additional mesh resolutions. Since we use unstructured meshes in the 3D domain, we aim to assess whether the network can generate results for both coarser and finer meshes. For error measurement in this particular example, we adopt a strict criterion based on the maximum pointwise error. To ensure the results are not case-dependent, we systematically test multiple samples, with five representative cases shown for each resolution in Fig. 10. The averaged values are highlighted in red.

As expected, the prediction error increases when moving to resolutions different from the training resolution. However, as demonstrated in Fig. 6, the overall results remain acceptable. A potential way to mitigate this issue is to train the network on multiple resolutions across different samples, reducing bias toward a specific mesh resolution.


In Fig. 11, we analyze the transient Allen-Cahn problem and compare results using relative $L^2$ error measurements across three different resolutions. On the far left, we show the trained resolution, followed by two additional cases with two and four times more nodes, respectively. As expected, and similar to other physical problems, the error increases slightly when moving to different resolutions, particularly finer ones. Moreover, this diagram illustrates how errors accumulate over time for a given resolution (observe the spacing between circles, squares, and triangles along the y-axis). A potential remedy is to train the model using multiple resolutions.

4.7 Computational cost analysis
Accuracy and computational cost are two key criteria for comparing classical techniques, such as the FEM and adjoint-based sensitivity analysis, with machine learning-based approaches, including PINNs and operator learning methods. In this section, we analyze the computational cost of the iFOL technique using the 3D nonlinear diffusion problem described in Section 4.3. Leveraging Google JAX Bradbury et al. (2018), we developed a unified framework that seamlessly integrates classical solver operations—such as element-wise computations, global Jacobian assembly, and right-hand side (RHS) construction—with machine learning techniques, including loss function evaluation and gradient-based optimization. This framework is implemented within the FOL paradigm Rezaei et al. (2024), utilizing key JAX features such as jax.vmap for efficient vectorized operations and jax.jit for just-in-time (JIT) compilation, enabling the translation of computationally demanding tasks into optimized machine code. The loss function is implemented exactly as formulated in Eq. 9. Furthermore, the physical loss function not only returns the element-wise weighted residual or its energy counterpart as a scalar loss function to be minimized during the learning process but also provides the corresponding elemental Jacobian and RHS. This unified implementation enables the same framework to be used for both FEM solvers and parameterized learning approaches such as iFOL.
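The dual role of the element kernel described above can be sketched as follows; `element_residual` is a hypothetical per-element physics routine, and `jax.jacfwd` supplies the elemental Jacobian.

```python
import jax
import jax.numpy as jnp

def element_outputs(u_e, mu_e, element_residual):
    """One kernel, two uses: the scalar loss contribution for learning (Eq. 9)
    and the elemental RHS/Jacobian for a classical Newton solve."""
    rhs = element_residual(u_e, mu_e)              # elemental residual (RHS)
    jac = jax.jacfwd(element_residual)(u_e, mu_e)  # elemental Jacobian dr/du
    loss = jnp.dot(u_e, rhs)                       # contribution to Eq. 9
    return loss, rhs, jac

# Vectorized over all elements (and JIT-compiled in practice):
# kernel = lambda u_e, mu_e: element_outputs(u_e, mu_e, element_residual)
# losses, rhss, jacs = jax.vmap(kernel)(u_els, mu_els)
```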
Table 2 summarizes the computational times for the primal and sensitivity analyses performed using iFOL and FEM. For the sake of brevity, we focus on the example reported in Section 4.3, where we deal with a nonlinear PDE in a 3D complex geometry. A similar trend of results was also observed for other cases. We note that training iFOL on 8000 samples with 3800 iterations required 14 hours on an NVIDIA A100 GPU with 40 GB of RAM (see Table 4). However, during inference, iFOL achieves speedups of 200×, 1000×, and 8300× compared to FEM on coarse (1069 nodes), normal (5485 nodes), and fine (37362 nodes) meshes, respectively. Beyond the speedup in the primal inference, iFOL also achieved 1.3× and 13× speedups for sensitivity analysis on normal and fine meshes, respectively. It should be noted that the reported times for AD-iFOL and Adjoint-FEM correspond exclusively to sensitivity analysis, assuming the primal solution is already available. Furthermore, the computational cost of iFOL-based sensitivity analysis remains independent of the number of response functions. In contrast, adjoint-based sensitivity analysis scales directly with the number of responses, as a separate adjoint system of equations must be solved for each response.
| Method | Training (h) | Batch size | Inference (ms/sample), 1069 nodes | 5485 nodes | 37362 nodes | # Iterations/sample |
|---|---|---|---|---|---|---|
| iFOL | 14 | 64 | 1.26 | 1.60 | 3.59 | 3 |
| FEM | – | 64 | 256 | 1599 | 29961 | 10 |
| AD-iFOL | – | 32 | 169 | 173 | 228 | 0 |
| Adjoint-FEM | – | 32 | 175 | 233 | 3152 | 1 |
5 Conclusion
We introduced the implicit Finite Operator Learning (iFOL) technique, a novel framework for solving parametric PDEs on arbitrary geometries. iFOL is a physics-informed automatic encoder-decoder that combines three state-of-the-art techniques: conditional neural fields for spatial encoding, second-order meta-learning for PDE encoding, and a physics-informed loss function inspired by the original FOL concept. This operator learning technique is capable of providing both a precise continuous mapping between parameter and solution spaces and a reliable estimation of the operator’s Jacobian. The computed Jacobian can be directly utilized for sensitivity analysis and gradient-based optimization without incurring additional computational costs.
The applicability of the methodology is tested on several classes of PDEs, with special attention to problems in computational mechanics and materials engineering, namely hyperelasticity, linear elasticity, nonlinear thermal diffusion, and the Allen-Cahn equation. We address stationary and transient problems in 2D and 3D, including both regular and complex geometries. The variety of examples is intended to cover most potential applications. In all cases, iFOL is able to parametrically learn the solution and predict unseen test samples, even beyond the training regime, demonstrating its generalization capability to new scenarios without any retraining.
It is worth highlighting that no labeled data is used throughout this work. Instead, the PDEs in the weighted residual form or in their energy counterparts are directly defined as loss functions. This approach improves efficiency, as no automatic differentiation operator is required to construct the loss terms. Moreover, it allows for the direct integration of residuals from other numerical solvers, such as FEM or spectral methods like FFT. Finally, since the network learns the equation parametrically for a class of problems, the inference time of iFOL can be highly competitive with direct numerical solvers like FEM. The speed-up factor is not only problem-dependent but also resolution-dependent and influenced by implementation details, making it difficult to provide a universal comparison, especially considering differences in hardware setups for DL-based solvers and classical numerical methods. In specific cases and under comparable conditions, we observe at least a 100× speed-up for nonlinear problems, which can further increase for higher resolutions.
Outlook and future research directions. Physics-informed neural operators, and more specifically the iFOL framework, offer significant potential for the next generation of numerical solvers by leveraging their ability to learn solutions parametrically. A natural next step is to apply the introduced methodologies to more complex multiphysics problems, where multiple physical processes influence each other through one- or two-way coupling. In such scenarios, classical solvers often struggle and require staggered approaches. Moreover, in this work, we primarily rely on a single input parameter space at a time. A natural extension would be to incorporate multiple input parameter spaces, a direction that has been explored to some extent by other authors, primarily using data-driven operator learning techniques. Finally, since we have access to meaningful and reliable sensitivity information independently of the number of objectives, gradient-based optimization becomes a compelling approach. This capability has significant potential for efficiently solving complex optimization and inverse problems, even in time-dependent settings, while maintaining reasonable computational and implementation costs.
Code Availability
The implementation of iFOL and the examples presented in this study will be made publicly available under the FOL framework repository on GitHub at https://github.com/RezaNajian/FOL, following the publication of this work.
Acknowledgments
The authors would like to thank the Deutsche Forschungsgemeinschaft (DFG) for the funding support provided to develop the present work in the Cluster of Excellence Project 'Internet of Production' (project: 390621612). The authors also acknowledge the financial support of SFB 1120 B07 - Mehrskalige thermomechanische Simulation der fest-flüssig Interaktionen bei der Erstarrung (260070971). Yusuke Yamazaki acknowledges the financial support from JST SPRING (JPMJSP2123). Mayu Muramatsu acknowledges the financial support from the JSPS KAKENHI Grant (22H01367).
Authors contributions
R.N.A.: Conceptualization, Methodology, Software, Writing - Review & Editing. Y.Y.: Software, Writing - Review & Editing. K.T.: Software. M.M.: Funding, Review & Editing. M.A.: Funding, Review & Editing. S.R.: Supervision, Methodology, Software, Writing - Review & Editing.
Appendix A Review on physical formulations and their discretization using FEM
A.1 Mechanical problem
Here, we summarize the mechanical problem, where the position of material points is denoted by $\boldsymbol{x}$. We denote the displacement components by $u$, $v$, and $w$ in the $x$, $y$, and $z$ directions, respectively. The kinematic relations define the strain measures in terms of the deformation vector $\boldsymbol{u}$ and read:

$$\boldsymbol{\varepsilon} = \frac{1}{2}\left(\nabla \boldsymbol{u} + \nabla \boldsymbol{u}^{\mathsf{T}}\right), \tag{13}$$
$$\boldsymbol{F} = \boldsymbol{I} + \nabla \boldsymbol{u}, \qquad J = \det \boldsymbol{F}. \tag{14}$$

Here, $\boldsymbol{F}$ is the deformation gradient, and $J$ is the Jacobian determinant, representing the local volume change. The elastic energy of the solid for the linear and nonlinear (hyperelastic Neo-Hookean material) cases reads:

$$\psi_{\text{lin}} = \frac{\lambda}{2} \, \mathrm{tr}^2(\boldsymbol{\varepsilon}) + \mu \, \mathrm{tr}\left(\boldsymbol{\varepsilon}^2\right), \tag{15}$$
$$\psi_{\text{NH}} = \frac{\mu}{2}\left(\bar{I}_1 - 3\right) + \frac{\lambda}{2}\left(J - 1\right)^2. \tag{16}$$

The Lamé constants are denoted by $\lambda$ and $\mu$, and $\mathrm{tr}(\boldsymbol{\varepsilon})$ is the trace of the strain tensor, representing the volumetric strain. $\bar{I}_1 = J^{-2/3} \, \mathrm{tr}(\boldsymbol{C})$ is the first invariant of the isochoric right Cauchy-Green deformation tensor, where $\boldsymbol{C} = \boldsymbol{F}^{\mathsf{T}} \boldsymbol{F}$ is the right Cauchy-Green tensor. We then define the Cauchy stress tensor $\boldsymbol{\sigma} = \mathbb{C} : \boldsymbol{\varepsilon}$ and the Piola-Kirchhoff stress $\boldsymbol{P} = \partial \psi_{\text{NH}} / \partial \boldsymbol{F}$, which are obtained by differentiating the strain energy function. Here, $\mathbb{C} = \lambda \, \boldsymbol{I} \otimes \boldsymbol{I} + 2\mu \, \mathbb{I}^{\text{sym}}$ is the fourth-order elasticity tensor, defining $\boldsymbol{I}$ as the second-order identity tensor and $\mathbb{I}^{\text{sym}}$ as the symmetric fourth-order identity tensor. Finally, the mechanical equilibrium in the absence of body force, as well as the Dirichlet and Neumann boundary conditions, are written as:

$$\nabla \cdot \boldsymbol{\sigma} = \boldsymbol{0} \quad \text{in } \Omega, \tag{17}$$
$$\boldsymbol{u} = \bar{\boldsymbol{u}} \quad \text{on } \Gamma_D, \tag{18}$$
$$\boldsymbol{\sigma} \cdot \boldsymbol{n} = \bar{\boldsymbol{t}} \quad \text{on } \Gamma_N. \tag{19}$$

In the above relations, $\Omega$ and $\Gamma$ denote the material points in the body and on the boundary area, respectively. Moreover, the Dirichlet and Neumann boundary conditions are introduced in Eq. 18 and Eq. 19, respectively. In the case of hyperelasticity, we then have $\nabla \cdot \boldsymbol{P} = \boldsymbol{0}$ instead of Eq. 17.
A.2 Thermal problem
The heat equation describes how the temperature $T(\boldsymbol{x}, t)$, with $\boldsymbol{x}$ representing position and $t$ the time, evolves within the domain $\Omega$ over time. Let the heat source be $Q$, the boundary temperature be $\bar{T}$, and the boundary heat flux be $\bar{q}$, where $\Gamma_D$ and $\Gamma_N$ are the regions where the Dirichlet and Neumann boundary conditions are applied, respectively. Here, $\tau = (0, t_{\text{end}}]$ represents the temporal domain, with $t_{\text{end}}$ denoting the end time. The strong form of the heat equation is given as:

$$\rho c \, \frac{\partial T}{\partial t} + \nabla \cdot \boldsymbol{q} = Q \quad \text{in } \Omega \times \tau, \tag{20}$$
$$T(\boldsymbol{x}, 0) = T_0(\boldsymbol{x}) \quad \text{in } \Omega, \tag{21}$$
$$T = \bar{T} \quad \text{on } \Gamma_D \times \tau, \tag{22}$$
$$\boldsymbol{q} \cdot \boldsymbol{n} = \bar{q} \quad \text{on } \Gamma_N \times \tau. \tag{23}$$

In the above equations, $\boldsymbol{n}$ is the outward normal vector, $c$ is the specific heat capacity, and $\rho$ is the density. The heat flux is given by $\boldsymbol{q} = -k(\boldsymbol{x}, T) \, \nabla T$, where $k$ represents the thermal conductivity, which depends on both position and temperature.
A.3 Allen-Cahn equation
The Allen-Cahn equation describes the evolution of an order parameter $\phi(\boldsymbol{x}, t)$, where $\boldsymbol{x}$ represents position and $t$ is time, within the domain $\Omega$. Let the source term be $s$, the boundary value be $\bar{\phi}$, and the boundary flux be $\bar{s}$, where $\Gamma_D$ and $\Gamma_N$ are the regions where the Dirichlet and Neumann boundary conditions are applied, respectively. Here, $\tau = (0, t_{\text{end}}]$ represents the temporal domain, with $t_{\text{end}}$ denoting the end time. The strong form of the Allen-Cahn equation is given as:

$$\frac{\partial \phi}{\partial t} + M \, \frac{\delta \mathcal{F}}{\delta \phi} = s \quad \text{in } \Omega \times \tau, \tag{24}$$
$$\phi(\boldsymbol{x}, 0) = \phi_0(\boldsymbol{x}) \quad \text{in } \Omega, \tag{25}$$
$$\phi = \bar{\phi} \quad \text{on } \Gamma_D \times \tau, \tag{26}$$
$$\nabla \phi \cdot \boldsymbol{n} = \bar{s} \quad \text{on } \Gamma_N \times \tau, \tag{27}$$
$$\mathcal{F}(\phi) = \int_{\Omega} \left( f(\phi) + \frac{\epsilon^2}{2} \left| \nabla \phi \right|^2 \right) \mathrm{d}\Omega. \tag{28}$$

In the above equations, $\boldsymbol{n}$ is the outward normal vector, $M$ is the mobility coefficient, and $\delta \mathcal{F} / \delta \phi$ represents the functional derivative of the free energy $\mathcal{F}$, typically given by:

$$\frac{\delta \mathcal{F}}{\delta \phi} = f'(\phi) - \epsilon^2 \nabla^2 \phi, \tag{29}$$

where $\epsilon$ is a small parameter controlling the interface thickness and $f'(\phi)$ denotes the derivative of the double-well potential enforcing phase separation. The above set of equations forms the basis for the results and studies presented in Section 4.5. The Allen-Cahn equation is a fundamental phase-field model and plays a crucial role in modeling microstructure evolution in alloys Allen and Cahn (1979); Karma and Rappel (1998). The equation captures the dynamics of interfaces and transitions between phases (including applications in phase-field fracture Bourdin et al. (2000); Rezaei et al. (2021)). Its applications extend to image processing, tumor growth modeling, and topology optimization.
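For illustration, assuming the common quartic double-well $f(\phi) = \frac{1}{4}(\phi^2 - 1)^2$ (a standard choice, not necessarily the exact potential used in this work), Eq. 29 evaluates as:

```python
def dF_dphi(phi, lap_phi, eps):
    """Functional derivative of Eq. 29 with f'(phi) = phi**3 - phi;
    works element-wise on scalars or arrays (lap_phi is the Laplacian of phi)."""
    return phi**3 - phi - eps**2 * lap_phi
```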
A.4 A short review on FEM formulation
Next, we briefly summarize the FEM discretization techniques, particularly for a 2D quadrilateral element, to clarify the loss terms in the iFOL formulation, especially for implementation purposes. The details provided here are fairly standard, and the extension to other types of complex element formulations with different polynomial orders is straightforward. Readers are also encouraged to see the standard procedure in any finite element subroutine Hughes (2000); Bathe (1996). The linear shape functions used to discretize the mechanical weak form in the current work read:

$$N_i(\xi, \eta) = \frac{1}{4} \left(1 + \xi_i \xi\right) \left(1 + \eta_i \eta\right), \quad i = 1, \dots, 4, \tag{30}$$

where $(\xi_i, \eta_i) \in \{(-1,-1), (1,-1), (1,1), (-1,1)\}$ are the corner coordinates of the parent element.
The notations $N_{i,x}$ and $N_{i,y}$ represent the derivatives of the shape function $N_i$ with respect to the coordinates $x$ and $y$, respectively. To compute these derivatives, we utilize the Jacobian matrix

$$\boldsymbol{J} = \begin{bmatrix} \partial x / \partial \xi & \partial y / \partial \xi \\ \partial x / \partial \eta & \partial y / \partial \eta \end{bmatrix}, \qquad \begin{bmatrix} N_{i,x} \\ N_{i,y} \end{bmatrix} = \boldsymbol{J}^{-1} \begin{bmatrix} N_{i,\xi} \\ N_{i,\eta} \end{bmatrix}. \tag{31}$$
Here, $(x, y)$ and $(\xi, \eta)$ represent the physical and parent coordinate systems, respectively. It is worth mentioning that the determinant of $\boldsymbol{J}$ remains constant for parallelogram-shaped elements, eliminating the need to evaluate this term at each integration point. Finally, for the $\boldsymbol{B}$ matrix, we have

$$\boldsymbol{B} = \begin{bmatrix} N_{1,x} & 0 & \cdots & N_{4,x} & 0 \\ 0 & N_{1,y} & \cdots & 0 & N_{4,y} \\ N_{1,y} & N_{1,x} & \cdots & N_{4,y} & N_{4,x} \end{bmatrix}. \tag{32}$$
For each element $e$, the deformation field and stress tensor are approximated as

$$\boldsymbol{u}^{e} = \boldsymbol{N} \, \hat{\boldsymbol{u}}^{e}, \qquad \boldsymbol{\sigma}^{e} = \boldsymbol{C} \, \boldsymbol{B} \, \hat{\boldsymbol{u}}^{e}. \tag{33}$$

Here, $\hat{\boldsymbol{u}}^{e}$ collects the nodal values of the deformation field of element $e$ in the $x$ and $y$ directions. The same procedure applies to both the thermal problem and the Allen-Cahn equation; however, one must account for the additional time derivatives. We summarize the implemented loss terms in Table 3.
| PDE type | FEM residual / iFOL loss function |
|---|---|
| Stationary nonlinear mechanical (Sec. 4.1) | Gauss-quadrature sum of the Neo-Hookean energy density (Eq. 16) over all elements |
| Stationary linear mechanical (Sec. 4.2) | Gauss-quadrature sum of the linear elastic energy density (Eq. 15) over all elements |
| Stationary nonlinear thermal (Sec. 4.3) | Weighted residual of the stationary form of Eq. 20, assembled element-wise as in Eq. 9 |
| Transient nonlinear thermal (Sec. 4.4) | Incremental (implicit Euler) energy form of Eq. 20 over one time step |
| Transient Allen-Cahn (Sec. 4.5) | Incremental (implicit Euler) energy form of Eqs. 24 and 29 over one time step |
In all the reported studies, we utilized simple first-order shape functions. The element type used in the studies varies based on the application, ranging from structured to unstructured meshes, including quadrilateral and tetrahedral elements, demonstrating the flexibility of the approach in handling problems with complex geometries.
Moreover, in the above set of equations, $n_{gp}$ and $n_{el}$ represent the number of Gaussian integration points and the number of elements, respectively. $\boldsymbol{\xi}_g$ and $w_g$ denote the coordinates and weight of the $g$-th integration point. The determinant of the Jacobian matrix is denoted by $\det \boldsymbol{J}$. Readers are encouraged to refer to Bathe (1996); Hughes (2000) as well as Rezaei et al. (2025b); Yamazaki et al. (2025b) and the references therein for more details on the FE formulation.
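A compact sketch of these element-level operations for the first-order quadrilateral (shape-function derivatives, the Jacobian of Eq. 31, and the $\boldsymbol{B}$ matrix of Eq. 32) is given below; the node ordering is assumed counter-clockwise starting from $(-1, -1)$.

```python
import jax.numpy as jnp

def quad4_B_and_detJ(xe, xi, eta):
    """xe: (4, 2) nodal coordinates; returns B (3, 8) and det(J) at (xi, eta)."""
    dN_dxi = 0.25 * jnp.array([          # columns: [dN/dxi, dN/deta], Eq. 30
        [-(1 - eta), -(1 - xi)],
        [ (1 - eta), -(1 + xi)],
        [ (1 + eta),  (1 + xi)],
        [-(1 + eta),  (1 - xi)]])
    J = dN_dxi.T @ xe                     # Jacobian matrix, Eq. 31
    dN_dx = dN_dxi @ jnp.linalg.inv(J).T  # shape-function derivatives w.r.t. x, y
    B = jnp.zeros((3, 8))
    B = B.at[0, 0::2].set(dN_dx[:, 0])    # eps_xx row
    B = B.at[1, 1::2].set(dN_dx[:, 1])    # eps_yy row
    B = B.at[2, 0::2].set(dN_dx[:, 1])    # gamma_xy row
    B = B.at[2, 1::2].set(dN_dx[:, 0])
    return B, jnp.linalg.det(J)
```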
Appendix B Choice of Hyperparameters
A comprehensive summary of the network's hyperparameters and configurations for all the models introduced above, as well as the various analyses conducted, is provided in Tables 4 and 5. NVIDIA Quadro RTX 6000s with 24 GB of RAM and an NVIDIA A100 with 40 GB of RAM were utilized for the training of the stationary problems. The studies on the transient problems were performed on an NVIDIA GeForce RTX 4090 with 24 GB of RAM.
We conducted extensive studies on multiple parameters, including latent size, learning rate, neural network structure, and the frequency of sinusoidal functions, among others. The values reported in Tables 4 and 5 correspond to the best-performing configurations. In general, we did not observe significant improvements with frequency parameters other than $\omega_0 = 30$, which interestingly performed well across various problem classes. Increasing the depth and latent size proved consistently beneficial with the sinusoidal activation function used in this study, though it comes with a higher training cost. Depending on problem complexity, reducing the number of epochs and latent iterations can further accelerate both training and inference in the proposed iFOL method. As discussed in the results section, having a larger number of samples also leads to better performance. On the other hand, the training cost may increase depending on the chosen batch size, and one must adapt the network architecture and latent size accordingly.
| Training parameter | Nonlin. elas. (Sec. 4.1) | Lin. elas. (Sec. 4.2) | Station. nonlin. therm. (Sec. 4.3) |
|---|---|---|---|
| Number of samples | 8000 | 1000 | 8000 |
| Grid in training | | | |
| Grid in evaluation | | | 37362 |
| Synthesizers | [64]*4 | [256]*6 | [128]*6 |
| $\omega_0$ | 30 | 30 | 30 |
| Modulators | Linear (FiLM) | Linear (FiLM) | Linear (FiLM) |
| Latent size | 512 | 1024 | 512 |
| Number of latent iterations | 3 | 3 | 3 |
| Latent/encoding learning rate | | | |
| Training learning rate | | | |
| Batch size | 320 | 120 | 100 |
| Gradient normalization | Yes | Yes | Yes |
| Number of epochs | 10000 | 10000 | 3800 |
| Optimizer | Adam | Adam | Adam |
| Total trainable parameters | 143938 | 330755 | 476417 |
| Training parameter | Transient nonlin. thermal (Sec. 4.4) | Transient nonlin. Allen-Cahn (Sec. 4.5) |
|---|---|---|
| Number of samples | 6000 | 8000 |
| Grid in training | | 2624 |
| Grid in evaluation | | 2624, 5678, 9873 |
| Synthesizers | [256]*6 | [256]*6 |
| $\omega_0$ | 30 | 30 |
| Modulators | Linear (FiLM) | Linear (FiLM) |
| Latent size | 256 | 256 |
| Number of latent iterations | 3 | 3 |
| Latent/encoding learning rate | | |
| Training learning rate | | |
| Batch size | 120 | 100 |
| Gradient normalization | Yes | Yes |
| Number of epochs | 10000 | 10000 |
| Optimizer | Adam | Adam |
| Total trainable parameters | 723457 | 723201 |
References
- Raissi et al. [2019] M. Raissi, P. Perdikaris, and G.E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019. ISSN 0021-9991. doi:https://doi.org/10.1016/j.jcp.2018.10.045. URL https://www.sciencedirect.com/science/article/pii/S0021999118307125.
- Rezaei et al. [2022] Shahed Rezaei, Ali Harandi, Ahmad Moeineddin, Bai-Xiang Xu, and Stefanie Reese. A mixed formulation for physics-informed neural networks as a potential solver for engineering problems in heterogeneous domains: Comparison with finite element method. Computer Methods in Applied Mechanics and Engineering, 401:115616, 2022. ISSN 0045-7825. doi:https://doi.org/10.1016/j.cma.2022.115616. URL https://www.sciencedirect.com/science/article/pii/S0045782522005722.
- Grossmann et al. [2024] Tamara G Grossmann, Urszula Julia Komorowska, Jonas Latz, and Carola-Bibiane Schönlieb. Can physics-informed neural networks beat the finite element method? IMA Journal of Applied Mathematics, 89(1):143–174, 05 2024. ISSN 0272-4960. doi:10.1093/imamat/hxae011. URL https://doi.org/10.1093/imamat/hxae011.
- Faroughi et al. [2024] Salah A. Faroughi, Nikhil M. Pawar, Célio Fernandes, Maziar Raissi, Subasish Das, Nima K. Kalantari, and Seyed Kourosh Mahjour. Physics-guided, physics-informed, and physics-encoded neural networks and operators in scientific computing: Fluid and solid mechanics. Journal of Computing and Information Science in Engineering, 24(4):1–45, 2024. doi:10.1115/1.4064449. URL https://doi.org/10.1115/1.4064449.
- Mitusch et al. [2021] Sebastian K. Mitusch, Simon W. Funke, and Miroslav Kuchta. Hybrid fem-nn models: Combining artificial neural networks with the finite element method. Journal of Computational Physics, 446:110651, 2021. ISSN 0021-9991. doi:https://doi.org/10.1016/j.jcp.2021.110651. URL https://www.sciencedirect.com/science/article/pii/S0021999121005465.
- Škardová et al. [2024] Kateřina Škardová, Alexandre Daby-Seesaram, and Martin Genet. Finite element neural network interpolation. Part I: Interpretable and adaptive discretization for solving pdes. 2024. URL https://arxiv.org/abs/2412.05719.
- Xiong et al. [2025] Wei Xiong, Xiangyun Long, Stéphane P.A. Bordas, and Chao Jiang. The deep finite element method: A deep learning framework integrating the physics-informed neural networks with the finite element method. Computer Methods in Applied Mechanics and Engineering, 436:117681, 2025. ISSN 0045-7825. doi:https://doi.org/10.1016/j.cma.2024.117681. URL https://www.sciencedirect.com/science/article/pii/S0045782524009356.
- Li et al. [2024] Haolin Li, Yuyang Miao, Zahra Sharif Khodaei, and M. H. Aliabadi. Finite-pinn: A physics-informed neural network architecture for solving solid mechanics problems with general geometries. 2024. URL https://arxiv.org/abs/2412.09453.
- Lu et al. [2021] Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. Learning nonlinear operators via deeponet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3(3):218–229, 2021. ISSN 2522-5839. doi:10.1038/s42256-021-00302-5. URL https://doi.org/10.1038/s42256-021-00302-5.
- Abueidda et al. [2025] Diab W. Abueidda, Panos Pantidis, and Mostafa E. Mobasher. Deepokan: Deep operator network based on kolmogorov arnold networks for mechanics problems. Computer Methods in Applied Mechanics and Engineering, 436:117699, 2025. ISSN 0045-7825. doi:https://doi.org/10.1016/j.cma.2024.117699. URL https://www.sciencedirect.com/science/article/pii/S0045782524009538.
- He et al. [2024] Junyan He, Seid Koric, Diab Abueidda, Ali Najafi, and Iwona Jasiuk. Geom-deeponet: A point-cloud-based deep operator network for field predictions on 3d parameterized geometries. Computer Methods in Applied Mechanics and Engineering, 429:117130, 2024. ISSN 0045-7825. doi:https://doi.org/10.1016/j.cma.2024.117130. URL https://www.sciencedirect.com/science/article/pii/S0045782524003864.
- Kumar et al. [2025] Varun Kumar, Somdatta Goswami, Katiana Kontolati, Michael D. Shields, and George Em Karniadakis. Synergistic learning with multi-task deeponet for efficient pde problem solving. Neural Networks, 184:107113, 2025. ISSN 0893-6080. doi:https://doi.org/10.1016/j.neunet.2024.107113. URL https://www.sciencedirect.com/science/article/pii/S0893608024010426.
- Yu et al. [2024] Xinling Yu, Sean Hooten, Ziyue Liu, Yequan Zhao, Marco Fiorentino, Thomas Van Vaerenbergh, and Zheng Zhang. Separable operator networks, 2024. URL https://arxiv.org/abs/2407.11253.
- Wang et al. [2021] Sifan Wang, Hanwen Wang, and Paris Perdikaris. Learning the solution operator of parametric partial differential equations with physics-informed deeponets. Science Advances, 7(40):eabi8605, 2021. doi:10.1126/sciadv.abi8605. URL https://www.science.org/doi/abs/10.1126/sciadv.abi8605.
- Mandl et al. [2025] Luis Mandl, Somdatta Goswami, Lena Lambers, and Tim Ricken. Separable physics-informed deeponet: Breaking the curse of dimensionality in physics-informed machine learning. Computer Methods in Applied Mechanics and Engineering, 434:117586, 2025. ISSN 0045-7825. doi:https://doi.org/10.1016/j.cma.2024.117586. URL https://www.sciencedirect.com/science/article/pii/S0045782524008405.
- Li et al. [2025] Haolin Li, Yuyang Miao, Zahra Sharif Khodaei, and M.H. Aliabadi. An architectural analysis of deeponet and a general extension of the physics-informed deeponet model on solving nonlinear parametric partial differential equations. Neurocomputing, 611:128675, 2025. ISSN 0925-2312. doi:https://doi.org/10.1016/j.neucom.2024.128675. URL https://www.sciencedirect.com/science/article/pii/S0925231224014462.
- Li et al. [2021] Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Fourier neural operator for parametric partial differential equations, 2021. URL https://arxiv.org/abs/2010.08895.
- Azizzadenesheli et al. [2024] Kamyar Azizzadenesheli, Nikola Kovachki, Zongyi Li, Miguel Liu-Schiaffini, Jean Kossaifi, and Anima Anandkumar. Neural operators for accelerating scientific simulations and design. Nature Reviews Physics, 6(5):320–328, 2024. ISSN 2522-5820. doi:10.1038/s42254-024-00712-5. URL https://doi.org/10.1038/s42254-024-00712-5.
- Li et al. [2023a] Zongyi Li, Daniel Zhengyu Huang, Burigede Liu, and Anima Anandkumar. Fourier neural operator with learned deformations for pdes on general geometries. Journal of Machine Learning Research, 24(388):1–26, 2023a. URL http://jmlr.org/papers/v24/23-0064.html.
- Li et al. [2023b] Zongyi Li, Hongkai Zheng, Nikola Kovachki, David Jin, Haoxuan Chen, Burigede Liu, Kamyar Azizzadenesheli, and Anima Anandkumar. Physics-informed neural operator for learning partial differential equations, 2023b. URL https://arxiv.org/abs/2111.03794.
- Ronneberger et al. [2015] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation, 2015. URL https://arxiv.org/abs/1505.04597.
- Chen et al. [2019] Junfeng Chen, Jonathan Viquerat, and Elie Hachem. U-net architectures for fast prediction of incompressible laminar flows, 2019. URL https://arxiv.org/abs/1910.13532.
- Mendizabal et al. [2020] Andrea Mendizabal, Pablo Márquez-Neila, and Stéphane Cotin. Simulation of hyperelastic materials in real-time using deep learning. Medical Image Analysis, 59:101569, 2020. ISSN 1361-8415. doi:https://doi.org/10.1016/j.media.2019.101569. URL https://www.sciencedirect.com/science/article/pii/S1361841519301094.
- Mianroodi et al. [2022] Jaber Rezaei Mianroodi, Shahed Rezaei, Nima H. Siboni, Bai-Xiang Xu, and Dierk Raabe. Lossless multi-scale constitutive elastic relations with artificial intelligence. npj Computational Materials, 8(1):67, April 2022. ISSN 2057-3960. doi:10.1038/s41524-022-00753-3. URL https://doi.org/10.1038/s41524-022-00753-3.
- Gupta et al. [2023] Ashwini Gupta, Anindya Bhaduri, and Lori Graham-Brady. Accelerated multiscale mechanics modeling in a deep learning framework. Mechanics of Materials, 184:104709, 2023. ISSN 0167-6636. doi:https://doi.org/10.1016/j.mechmat.2023.104709. URL https://www.sciencedirect.com/science/article/pii/S0167663623001552.
- Najafi Koopas et al. [2025] Rasoul Najafi Koopas, Shahed Rezaei, Natalie Rauter, Richard Ostwald, and Rolf Lammering. A spatiotemporal deep learning framework for prediction of crack dynamics in heterogeneous solids: Efficient mapping of concrete microstructures to its fracture properties. Engineering Fracture Mechanics, 314:110675, 2025. ISSN 0013-7944. doi:https://doi.org/10.1016/j.engfracmech.2024.110675. URL https://www.sciencedirect.com/science/article/pii/S0013794424008385.
- Yin et al. [2024] Minglang Yin, Nicolas Charon, Ryan Brody, Lu Lu, Natalia Trayanova, and Mauro Maggioni. A scalable framework for learning the geometry-dependent solution operators of partial differential equations. Nature Computational Science, 4(12):928–940, 2024. ISSN 2662-8457. doi:10.1038/s43588-024-00732-2. URL https://doi.org/10.1038/s43588-024-00732-2.
- Xiao et al. [2024] Shanshan Xiao, Pengzhan Jin, and Yifa Tang. A deformation-based framework for learning solution mappings of pdes defined on varying domains, 2024. URL https://arxiv.org/abs/2412.01379.
- Zeng et al. [2025] Chenyu Zeng, Yanshu Zhang, Jiayi Zhou, Yuhan Wang, Zilin Wang, Yuhao Liu, Lei Wu, and Daniel Zhengyu Huang. Point cloud neural operator for parametric pdes on complex and variable geometries, 2025. URL https://arxiv.org/abs/2501.14475.
- Chen et al. [2024a] Gengxiang Chen, Zhi Li, Chao Li, Zhen Li, and Yike Guo. Learning neural operators on riemannian manifolds. National Science Open, 3(1):20240001, 2024a. doi:10.1360/nso/20240001. URL https://www.sciengine.com/NSO/doi/10.1360/nso/20240001.
- Yamazaki et al. [2025a] Yusuke Yamazaki, Ali Harandi, Mayu Muramatsu, Alexandre Viardin, Markus Apel, Tim Brepols, Stefanie Reese, and Shahed Rezaei. A finite element-based physics-informed operator learning framework for spatiotemporal partial differential equations on arbitrary domains. Engineering with Computers, 41(1):1–29, 2025a. ISSN 1435-5663. doi:10.1007/s00366-024-02033-8. URL https://doi.org/10.1007/s00366-024-02033-8.
- Rezaei et al. [2025a] Shahed Rezaei, Reza Najian Asl, Shirko Faroughi, Mahdi Asgharzadeh, Ali Harandi, Rasoul Najafi Koopas, Gottfried Laschet, Stefanie Reese, and Markus Apel. A finite operator learning technique for mapping the elastic properties of microstructures to their mechanical deformations. International Journal for Numerical Methods in Engineering, 126(1):e7637, 2025a. doi:https://doi.org/10.1002/nme.7637. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/nme.7637.
- Xu et al. [2024] Tengfei Xu, Dachuan Liu, Peng Hao, and Bo Wang. Variational operator learning: A unified paradigm marrying training neural operators and solving partial differential equations. Journal of the Mechanics and Physics of Solids, 190:105714, 2024. ISSN 0022-5096. doi:https://doi.org/10.1016/j.jmps.2024.105714. URL https://www.sciencedirect.com/science/article/pii/S0022509624001807.
- Eshaghi et al. [2025] Mohammad Sadegh Eshaghi, Cosmin Anitescu, Manish Thombre, Yizheng Wang, Xiaoying Zhuang, and Timon Rabczuk. Variational physics-informed neural operator (vino) for solving partial differential equations. Computer Methods in Applied Mechanics and Engineering, 437:117785, 2025. ISSN 0045-7825. doi:https://doi.org/10.1016/j.cma.2025.117785. URL https://www.sciencedirect.com/science/article/pii/S004578252500057X.
- Lee et al. [2025] Jae Yong Lee, Seungchan Ko, and Youngjoon Hong. Finite element operator network for solving elliptic-type parametric pdes, 2025. URL https://arxiv.org/abs/2308.04690.
- Kaewnuratchadasorn et al. [2024] Chawit Kaewnuratchadasorn, Jiaji Wang, and Chul-Woo Kim. Physics-informed neural operator solver and super-resolution for solid mechanics. Computer-Aided Civil and Infrastructure Engineering, 39(22):3435–3451, 2024. doi:https://doi.org/10.1111/mice.13292. URL https://onlinelibrary.wiley.com/doi/abs/10.1111/mice.13292.
- Franco et al. [2023] Nicola Rares Franco, Andrea Manzoni, and Paolo Zunino. Mesh-informed neural networks for operator learning in finite element spaces. Journal of Scientific Computing, 97(35), 2023. doi:10.1007/s10915-023-02331-1. URL https://doi.org/10.1007/s10915-023-02331-1.
- Harandi et al. [2025] Ali Harandi, Hooman Danesh, Kevin Linka, Stefanie Reese, and Shahed Rezaei. A spectral-based physics-informed finite operator learning for prediction of mechanical behavior of microstructures, 2025. URL https://arxiv.org/abs/2410.19027.
- Serrano et al. [2023] Louis Serrano, Lise Le Boudec, Armand Kassaï Koupaï, Thomas X Wang, Yuan Yin, Jean-Noël Vittaut, and Patrick Gallinari. Operator learning with neural fields: Tackling pdes on general geometries. Advances in Neural Information Processing Systems, 36:70581–70611, 2023.
- Naour et al. [2024] Etienne Le Naour, Louis Serrano, Léon Migus, Yuan Yin, Ghislain Agoua, Nicolas Baskiotis, Patrick Gallinari, and Vincent Guigue. Time series continuous modeling for imputation and forecasting with implicit neural representations. 2024. URL https://arxiv.org/abs/2306.05880.
- Boudec et al. [2024] Lise Le Boudec, Emmanuel de Bezenac, Louis Serrano, Ramon Daniel Regueiro-Espino, Yuan Yin, and Patrick Gallinari. Learning a neural solver for parametric pde to enhance physics-informed methods. 2024. URL https://arxiv.org/abs/2410.06820.
- Yeom et al. [2025] Taesun Yeom, Sangyoon Lee, and Jaeho Lee. Fast training of sinusoidal neural fields via scaling initialization, 2025. URL https://arxiv.org/abs/2410.04779.
- Hagnberger et al. [2024] Jan Hagnberger, Marimuthu Kalimuthu, Daniel Musekamp, and Mathias Niepert. Vectorized conditional neural fields: A framework for solving time-dependent parametric partial differential equations, 2024. URL https://arxiv.org/abs/2406.03919.
- Du et al. [2024a] Pan Du, Meet Hemant Parikh, Xiantao Fan, Xin-Yang Liu, and Jian-Xun Wang. Conditional neural field latent diffusion model for generating spatiotemporal turbulence. Nature Communications, 15(1):10416, November 2024a. ISSN 2041-1723. doi:10.1038/s41467-024-54712-1. URL https://doi.org/10.1038/s41467-024-54712-1.
- Dupont et al. [2022a] Emilien Dupont, Hrushikesh Loya, Milad Alizadeh, Adam Goliński, Yee Whye Teh, and Arnaud Doucet. COIN++: Neural compression across modalities. arXiv preprint arXiv:2201.12904, 2022a.
- Catalani et al. [2024] Giovanni Catalani, Siddhant Agarwal, Xavier Bertrand, Frédéric Tost, Michael Bauerheim, and Joseph Morlier. Neural fields for rapid aircraft aerodynamics simulations. Scientific Reports, 14(1):25496, 2024.
- Sitzmann et al. [2020] Vincent Sitzmann, Julien Martel, Alexander Bergman, David Lindell, and Gordon Wetzstein. Implicit neural representations with periodic activation functions. Advances in neural information processing systems, 33:7462–7473, 2020.
- Xie et al. [2022] Yiheng Xie, Towaki Takikawa, Shunsuke Saito, Or Litany, Shiqin Yan, Numair Khan, Federico Tombari, James Tompkin, Vincent Sitzmann, and Srinath Sridhar. Neural fields in visual computing and beyond. In Computer Graphics Forum, volume 41, pages 641–676. Wiley Online Library, 2022.
- Perez et al. [2018] Ethan Perez, Florian Strub, Harm De Vries, Vincent Dumoulin, and Aaron Courville. Film: Visual reasoning with a general conditioning layer. In Proceedings of the AAAI conference on artificial intelligence, volume 32, 2018.
- Dupont et al. [2022b] Emilien Dupont, Hyunjik Kim, SM Eslami, Danilo Rezende, and Dan Rosenbaum. From data to functa: Your data point is a function and you can treat it like one. arXiv preprint arXiv:2201.12204, 2022b.
- Yin et al. [2022] Yuan Yin, Matthieu Kirchmeyer, Jean-Yves Franceschi, Alain Rakotomamonjy, and Patrick Gallinari. Continuous pde dynamics forecasting with implicit neural representations. arXiv preprint arXiv:2209.14855, 2022.
- Zintgraf et al. [2019] Luisa Zintgraf, Kyriacos Shiarli, Vitaly Kurin, Katja Hofmann, and Shimon Whiteson. Fast context adaptation via meta-learning. In International conference on machine learning, pages 7693–7702. PMLR, 2019.
- Rezaei et al. [2024] Shahed Rezaei, Reza Najian Asl, Kianoosh Taghikhani, Ahmad Moeineddin, Michael Kaliske, and Markus Apel. Finite operator learning: Bridging neural operators and numerical methods for efficient parametric solution and optimization of pdes. arXiv preprint arXiv:2407.04157, 2024.
- Dingreville et al. [2024] Rémi Dingreville, Pablo Seleson, Nathaniel Trask, Mitchell Wood, and Donghyun You. Rethinking materials simulations: Blending direct numerical simulations with neural operators. npj Computational Materials, 10(1):124, 2024. doi:10.1038/s41524-024-01319-1. URL https://www.nature.com/articles/s41524-024-01319-1.
- Wang et al. [2024] Sifan Wang, Shyam Sankaran, and Paris Perdikaris. Respecting causality for training physics-informed neural networks. Computer Methods in Applied Mechanics and Engineering, 421:116813, 2024. ISSN 0045-7825. doi:https://doi.org/10.1016/j.cma.2024.116813. URL https://www.sciencedirect.com/science/article/pii/S0045782524000690.
- Chen et al. [2024b] Wan-Xin Chen, Jeffery M. Allen, Shahed Rezaei, Orkun Furat, Volker Schmidt, Avtar Singh, Peter J. Weddle, Kandler Smith, and Bai-Xiang Xu. Cohesive phase-field chemo-mechanical simulations of inter- and trans- granular fractures in polycrystalline nmc cathodes via image-based 3d reconstruction. Journal of Power Sources, 596:234054, 2024b. ISSN 0378-7753. doi:https://doi.org/10.1016/j.jpowsour.2024.234054. URL https://www.sciencedirect.com/science/article/pii/S0378775324000053.
- Li et al. [2023c] Wei Li, Martin Z. Bazant, and Juner Zhu. Phase-field deeponet: Physics-informed deep operator neural network for fast simulations of pattern formation governed by gradient flows of free-energy functionals. Computer Methods in Applied Mechanics and Engineering, 416:116299, 2023c. ISSN 0045-7825. doi:https://doi.org/10.1016/j.cma.2023.116299. URL https://www.sciencedirect.com/science/article/pii/S0045782523004231.
- Koopas et al. [2024] Rasoul Najafi Koopas, Shahed Rezaei, Natalie Rauter, Richard Ostwald, and Rolf Lammering. Introducing a microstructure-embedded autoencoder approach for reconstructing high-resolution solution field data from a reduced parametric space. Computational Mechanics, November 2024. doi:10.1007/s00466-024-02568-z. URL https://link.springer.com/article/10.1007/s00466-024-02568-z.
- Quarteroni and Rozza [2011] Alfio Quarteroni and Gianluigi Rozza. Reduced order methods for modeling and computational reduction. Archives of Computational Methods in Engineering, 17(3):149–159, 2011. doi:10.1007/s11831-011-9064-7.
- Bradbury et al. [2018] James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. JAX: Composable transformations of Python+NumPy programs. GitHub repository, 2018. URL https://github.com/google/jax.
- Allen and Cahn [1979] Samuel M. Allen and John W. Cahn. A microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening. Acta Metallurgica, 27(6):1085–1095, 1979. ISSN 0001-6160. doi:https://doi.org/10.1016/0001-6160(79)90196-2. URL https://www.sciencedirect.com/science/article/pii/0001616079901962.
- Karma and Rappel [1998] Alain Karma and Wouter-Jan Rappel. Quantitative phase-field modeling of dendritic growth in two and three dimensions. Phys. Rev. E, 57:4323–4349, Apr 1998. doi:10.1103/PhysRevE.57.4323. URL https://link.aps.org/doi/10.1103/PhysRevE.57.4323.
- Bourdin et al. [2000] B. Bourdin, G.A. Francfort, and J-J. Marigo. Numerical experiments in revisited brittle fracture. Journal of the Mechanics and Physics of Solids, 48(4):797–826, 2000. ISSN 0022-5096. doi:https://doi.org/10.1016/S0022-5096(99)00028-9. URL https://www.sciencedirect.com/science/article/pii/S0022509699000289.
- Rezaei et al. [2021] Shahed Rezaei, Jaber Rezaei Mianroodi, Tim Brepols, and Stefanie Reese. Direction-dependent fracture in solids: Atomistically calibrated phase-field and cohesive zone model. Journal of the Mechanics and Physics of Solids, 147:104253, 2021. ISSN 0022-5096. doi:https://doi.org/10.1016/j.jmps.2020.104253. URL https://www.sciencedirect.com/science/article/pii/S0022509620304634.
- Hughes [2000] Thomas J. R. Hughes. The Finite Element Method: Linear Static and Dynamic Finite Element Analysis. Dover Publications, 2000.
- Bathe [1996] Klaus-Jürgen Bathe. Finite Element Procedures. Prentice Hall, 1996.