MFlowCode
diff --git a/documentation/doxygen_crawl.html b/documentation/doxygen_crawl.html
Lines changed: 27 additions & 19 deletions
diff --git a/documentation/md_gpuParallelization.html b/documentation/md_gpuParallelization.html
Lines changed: 723 additions & 0 deletions
diff --git a/documentation/md_papers.html b/documentation/md_papers.html
Lines changed: 1 addition & 1 deletion
diff --git a/documentation/md_readme.html b/documentation/md_readme.html
Lines changed: 3 additions & 3 deletions
diff --git a/documentation/md_references.html b/documentation/md_references.html
Lines changed: 1 addition & 1 deletion
diff --git a/documentation/md_running.html b/documentation/md_running.html
Lines changed: 8 additions & 8 deletions
diff --git a/documentation/md_testing.html b/documentation/md_testing.html
Lines changed: 3 additions & 3 deletions
@@ -115,31 +115,39 @@
 <a href="md_getting-started.html#autotoc_md97"/>
 <a href="md_getting-started.html#autotoc_md98"/>
 <a href="md_getting-started.html#autotoc_md99"/>
+<a href="md_gpuParallelization.html"/>
+<a href="md_gpuParallelization.html#autotoc_md104"/>
+<a href="md_gpuParallelization.html#autotoc_md105"/>
+<a href="md_gpuParallelization.html#autotoc_md106"/>
+<a href="md_gpuParallelization.html#autotoc_md108"/>
+<a href="md_gpuParallelization.html#autotoc_md110"/>
+<a href="md_gpuParallelization.html#autotoc_md112"/>
+<a href="md_gpuParallelization.html#autotoc_md114"/>
 <a href="md_papers.html"/>
 <a href="md_readme.html"/>
-<a href="md_readme.html#autotoc_md104"/>
-<a href="md_readme.html#autotoc_md105"/>
+<a href="md_readme.html#autotoc_md118"/>
+<a href="md_readme.html#autotoc_md119"/>
 <a href="md_references.html"/>
 <a href="md_running.html"/>
-<a href="md_running.html#autotoc_md108"/>
-<a href="md_running.html#autotoc_md109"/>
-<a href="md_running.html#autotoc_md110"/>
-<a href="md_running.html#autotoc_md111"/>
-<a href="md_running.html#autotoc_md112"/>
-<a href="md_running.html#autotoc_md113"/>
-<a href="md_running.html#autotoc_md114"/>
+<a href="md_running.html#autotoc_md122"/>
+<a href="md_running.html#autotoc_md123"/>
+<a href="md_running.html#autotoc_md124"/>
+<a href="md_running.html#autotoc_md125"/>
+<a href="md_running.html#autotoc_md126"/>
+<a href="md_running.html#autotoc_md127"/>
+<a href="md_running.html#autotoc_md128"/>
 <a href="md_testing.html"/>
-<a href="md_testing.html#autotoc_md116"/>
-<a href="md_testing.html#autotoc_md117"/>
+<a href="md_testing.html#autotoc_md130"/>
+<a href="md_testing.html#autotoc_md131"/>
 <a href="md_visualization.html"/>
-<a href="md_visualization.html#autotoc_md119"/>
-<a href="md_visualization.html#autotoc_md120"/>
-<a href="md_visualization.html#autotoc_md121"/>
-<a href="md_visualization.html#autotoc_md122"/>
-<a href="md_visualization.html#autotoc_md123"/>
-<a href="md_visualization.html#autotoc_md124"/>
-<a href="md_visualization.html#autotoc_md125"/>
-<a href="md_visualization.html#autotoc_md126"/>
+<a href="md_visualization.html#autotoc_md133"/>
+<a href="md_visualization.html#autotoc_md134"/>
+<a href="md_visualization.html#autotoc_md135"/>
+<a href="md_visualization.html#autotoc_md136"/>
+<a href="md_visualization.html#autotoc_md137"/>
+<a href="md_visualization.html#autotoc_md138"/>
+<a href="md_visualization.html#autotoc_md139"/>
+<a href="md_visualization.html#autotoc_md140"/>
 <a href="pages.html"/>
 </body>
 </html>
@@ -135,7 +135,7 @@
   <div class="headertitle"><div class="title">Papers</div></div>
 </div><!--header-->
 <div class="contents">
-<div class="textblock"><p><a class="anchor" id="autotoc_md102"></a></p>
+<div class="textblock"><p><a class="anchor" id="autotoc_md116"></a></p>
 <p>MFC: An open-source high-order multi-component, multi-phase, and multi-scale compressible flow solver. <a href="https://doi.org/10.1016/j.cpc.2020.107396">S. H. Bryngelson, K. Schmidmayer, V. Coralic, K. Maeda, J. Meng, T. Colonius (2021) Computer Physics Communications <b>266</b>, 107396</a></p>
 <div class="fragment"><div class="line">@article{Bryngelson_2021,</div>
 <div class="line">  title = {{MFC: A}n open-source high-order multi-component, multi-phase, and multi-scale compressible flow solver},</div>
 
@@ -135,8 +135,8 @@
   <div class="headertitle"><div class="title">Documentation</div></div>
 </div><!--header-->
 <div class="contents">
-<div class="textblock"><p><a class="anchor" id="autotoc_md103"></a></p>
-<h1><a class="anchor" id="autotoc_md104"></a>
+<div class="textblock"><p><a class="anchor" id="autotoc_md117"></a></p>
+<h1><a class="anchor" id="autotoc_md118"></a>
 User Documentation</h1>
 <ul>
 <li><a class="el" href="md_getting-started.html">Getting Started</a></li>
@@ -149,7 +149,7 @@ <h1><a class="anchor" id="autotoc_md104"></a>
 <li><a class="el" href="md_authors.html">MFC's Authors</a></li>
 <li><a class="el" href="md_references.html">References</a></li>
 </ul>
-<h1><a class="anchor" id="autotoc_md105"></a>
+<h1><a class="anchor" id="autotoc_md119"></a>
 Code/API Documentation</h1>
 <p>MFC's three codes have their own documentation:</p>
 <ul>
 
@@ -135,7 +135,7 @@
   <div class="headertitle"><div class="title">References</div></div>
 </div><!--header-->
 <div class="contents">
-<div class="textblock"><p><a class="anchor" id="autotoc_md106"></a></p>
+<div class="textblock"><p><a class="anchor" id="autotoc_md120"></a></p>
 <ul>
 <li><a class="anchor" id="Allaire02"></a>Allaire, G., Clerc, S., and Kokh, S. (2002). A five-equation model for the simulation of interfaces between compressible fluids. Journal of Computational Physics, 181(2):577–616.</li>
 </ul>
 
@@ -135,7 +135,7 @@
   <div class="headertitle"><div class="title">Running</div></div>
 </div><!--header-->
 <div class="contents">
-<div class="textblock"><p><a class="anchor" id="autotoc_md107"></a></p>
+<div class="textblock"><p><a class="anchor" id="autotoc_md121"></a></p>
 <p>MFC can be run using <code>mfc.sh</code>'s <code>run</code> command. It supports interactive and batch execution. Batch mode is designed for multi-node distributed systems (supercomputers) equipped with a scheduler such as PBS, SLURM, or LSF. A full (and up-to-date) list of available arguments can be acquired with <code>./mfc.sh run -h</code>.</p>
 <p>MFC supports running simulations locally (Linux, macOS, and Windows) as well as on several supercomputer clusters, both interactively and through batch submission.</p>
 <dl class="section important"><dt>Important</dt><dd>Running simulations locally should work out of the box. On supported clusters, you can append <code>-c &lt;computer name&gt;</code> on the command line to instruct the MFC toolchain to make use of the template file <code>toolchain/templates/&lt;computer name&gt;.mako</code>. You can browse that directory and contribute your own files. Since systems and their schedulers do not have a standardized syntax to request certain resources, MFC can only provide support for a restricted subset of common or user-contributed configuration options. <br  />
@@ -149,7 +149,7 @@
 </ul>
 </dd></dl>
 <p>Please refer to <code>./mfc.sh run -h</code> for a complete list of arguments and options, along with their defaults.</p>
-<h1><a class="anchor" id="autotoc_md108"></a>
+<h1><a class="anchor" id="autotoc_md122"></a>
 Interactive Execution</h1>
 <p>To run all stages of MFC, that is <a href="https://github.com/MFlowCode/MFC/tree/master/src/pre_process/">pre_process</a>, <a href="https://github.com/MFlowCode/MFC/tree/master/src/simulation/">simulation</a>, and <a href="https://github.com/MFlowCode/MFC/tree/master/src/post_process/">post_process</a> on the sample case <a href="https://github.com/MFlowCode/MFC/tree/master/examples/2D_shockbubble/">2D_shockbubble</a>,</p>
 <div class="fragment"><div class="line">./mfc.sh run examples/2D_shockbubble/case.py</div>
@@ -163,7 +163,7 @@ <h1><a class="anchor" id="autotoc_md108"></a>
 <li>Running <a href="https://github.com/MFlowCode/MFC/tree/master/src/simulation/">simulation</a> and <a href="https://github.com/MFlowCode/MFC/tree/master/src/post_process/">post_process</a> using 4 cores:</li>
 </ul>
 <div class="fragment"><div class="line">./mfc.sh run examples/2D_shockbubble/case.py -t simulation post_process -n 4</div>
-</div><!-- fragment --><h1><a class="anchor" id="autotoc_md109"></a>
+</div><!-- fragment --><h1><a class="anchor" id="autotoc_md123"></a>
 Batch Execution</h1>
 <p>The MFC detects which scheduler your system is using and handles the creation and execution of batch scripts. The batch engine is requested via the <code>-e batch</code> option. The number of nodes can be specified with the <code>-N</code> (i.e., <code>--nodes</code>) option.</p>
 <p>We provide a list of (baked-in) submission batch scripts in the <code>toolchain/templates</code> folder.</p>
@@ -178,22 +178,22 @@ <h1><a class="anchor" id="autotoc_md108"></a>
 </ul>
 <p>As an example, one might request GPUs on a SLURM system using the following:</p>
 <p><b>Disclaimer</b>: IBM's JSRUN on LSF-managed computers does not use the traditional node-based approach to allocate resources. Therefore, the MFC constructs equivalent resource sets in the task and GPU count.</p>
-<h2><a class="anchor" id="autotoc_md110"></a>
+<h2><a class="anchor" id="autotoc_md124"></a>
 GPU Profiling</h2>
-<h3><a class="anchor" id="autotoc_md111"></a>
+<h3><a class="anchor" id="autotoc_md125"></a>
 NVIDIA GPUs</h3>
 <p>MFC provides two different arguments to facilitate profiling with NVIDIA Nsight. <b>Please place the chosen argument at the end of the command so that its respective flags can be appended.</b></p><ul>
 <li>Nsight Systems (Nsys): <code>./mfc.sh run ... -t simulation --nsys [nsys flags]</code> allows one to visualize MFC's system-wide performance with <a href="https://developer.nvidia.com/nsight-systems">NVIDIA Nsight Systems</a>. Nsys is best for understanding the order and execution times of major subroutines (WENO, Riemann, etc.) in MFC. When used, <code>--nsys</code> will run the simulation and generate <code>.nsys-rep</code> files in the case directory for all targets. These files can then be imported into the Nsight Systems GUI, which can be downloaded <a href="https://developer.nvidia.com/nsight-systems/get-started#latest-Platforms">here</a>. To keep the report files small, it is best to run case files with only a few timesteps. Learn more about NVIDIA Nsight Systems <a href="https://docs.nvidia.com/nsight-systems/UserGuide/index.html">here</a>.</li>
 <li>Nsight Compute (NCU): <code>./mfc.sh run ... -t simulation --ncu [ncu flags]</code> allows one to conduct kernel-level profiling with <a href="https://developer.nvidia.com/nsight-compute">NVIDIA Nsight Compute</a>. NCU provides profiling information for every subroutine called and is more detailed than Nsys. When used, <code>--ncu</code> will output profiling information for all subroutines, including elapsed clock cycles, memory used, and more, after the simulation is run. Adding this argument will significantly slow the simulation, so it should only be used on case files with a few timesteps. Learn more about NVIDIA Nsight Compute <a href="https://docs.nvidia.com/nsight-compute/NsightCompute/index.html">here</a>.</li>
 </ul>
-<h3><a class="anchor" id="autotoc_md112"></a>
+<h3><a class="anchor" id="autotoc_md126"></a>
 AMD GPUs</h3>
 <ul>
 <li>Rocprof Systems (RSYS): <code>./mfc.sh run ... -t simulation --rsys --hip-trace [rocprof flags]</code> allows one to visualize MFC's system-wide performance with <a href="https://ui.perfetto.dev/">Perfetto UI</a>. When used, <code>--rsys</code> will run the simulation and generate files in the case directory for all targets. <code>results.json</code> can then be imported in <a href="https://ui.perfetto.dev/">Perfetto's UI</a>. Learn more about AMD Rocprof <a href="https://rocm.docs.amd.com/projects/rocprofiler/en/docs-5.5.1/rocprof.html">here</a>. It is best to run case files with few timesteps to keep the report file sizes manageable.</li>
 <li>Rocprof Compute (RCU): <code>./mfc.sh run ... -t simulation --rcu -n &lt;name&gt; [rocprof-compute flags]</code> allows one to conduct kernel-level profiling with <a href="https://rocm.docs.amd.com/projects/rocprofiler-compute/en/latest/what-is-rocprof-compute.html">ROCm Compute Profiler</a>. When used, <code>--rcu</code> will output profiling information for all subroutines, including rooflines, cache usage, register usage, and more, after the simulation is run. Adding this argument will moderately slow down the simulation and run the MFC executable several times. For this reason, it should only be used with case files with few timesteps.</li>
 </ul>
 <p><a class="anchor" id="restarting-cases"></a> </p>
-<h2><a class="anchor" id="autotoc_md113"></a>
+<h2><a class="anchor" id="autotoc_md127"></a>
 Restarting Cases</h2>
 <p>When running a simulation, MFC generates a <code>./restart_data</code> folder in the case directory that contains <code>lustre_*.dat</code> files, which can be used to restart a simulation from saved timesteps. This allows a user to simulate up to some timestep $X$, then continue the run to a later timestep $Y$, where $Y &gt; X$. The user can also choose to add new patches at the intermediate timestep.</p>
 <p>If you want to restart a simulation,</p>
@@ -289,7 +289,7 @@ <h2><a class="anchor" id="autotoc_md113"></a>
 <div class="line">./mfc.sh run examples/1D_vacuum_restart/restart_case.py -t pre_process simulation</div>
 <div class="line">./mfc.sh run examples/1D_vacuum_restart/case.py -t post_process</div>
 <div class="line">./mfc.sh run examples/1D_vacuum_restart/restart_case.py -t post_process</div>
-</div><!-- fragment --><h2><a class="anchor" id="autotoc_md114"></a>
+</div><!-- fragment --><h2><a class="anchor" id="autotoc_md128"></a>
 Example Runs</h2>
 <ul>
 <li>Oak Ridge National Laboratory's <a href="https://www.olcf.ornl.gov/summit/">Summit</a>:</li>
 
@@ -135,7 +135,7 @@
   <div class="headertitle"><div class="title">Testing</div></div>
 </div><!--header-->
 <div class="contents">
-<div class="textblock"><p><a class="anchor" id="autotoc_md115"></a></p>
+<div class="textblock"><p><a class="anchor" id="autotoc_md129"></a></p>
 <p>To run MFC's test suite, run </p><div class="fragment"><div class="line">./mfc.sh test -j &lt;thread count&gt;</div>
 </div><!-- fragment --><p>It will generate and run test cases, comparing their output to previous runs from versions of MFC considered accurate. <em>Golden files</em>, stored in the <code>tests/</code> directory, contain this data, aggregating <code>.dat</code> files generated when running MFC. A test is considered passing when our error tolerances are met, in order to maintain a high level of stability and accuracy. <code>./mfc.sh test</code> has the following unique options:</p><ul>
 <li><code>-l</code> outputs the full list of tests</li>
@@ -148,7 +148,7 @@
 </ul>
 <p>To specify a computer, pass the <code>-c</code> flag to <code>./mfc.sh run</code> like so: </p><div class="fragment"><div class="line">./mfc.sh test -j &lt;thread count&gt; -- -c &lt;computer name&gt;</div>
 </div><!-- fragment --><p> where <code>&lt;computer name&gt;</code> could be <code>phoenix</code> or any of the others in the <a href="https://github.com/MFlowCode/MFC/tree/master/toolchain/templates">templates</a>. You can create new templates with the appropriate run commands or omit this option. The use of <code>--</code> in the above command passes the options that follow it to the <code>./mfc.sh run</code> command that underlies <code>./mfc.sh test</code>.</p>
-<h2><a class="anchor" id="autotoc_md116"></a>
+<h2><a class="anchor" id="autotoc_md130"></a>
 Creating Tests</h2>
 <p>Creating and updating test cases can be done with the following command line arguments:</p><ul>
 <li><code>--generate</code> to generate golden files for a new test case</li>
@@ -195,7 +195,7 @@ <h2><a class="anchor" id="autotoc_md116"></a>
 </ul>
 <p>If a trace is empty (that is, the empty string <code>""</code>), it will not appear in the final trace, but any case parameter variations associated with it will still be applied.</p>
 <p>Finally, the case is appended to the <code>cases</code> list, which will be returned by the <code>list_cases</code> function.</p>
-<h2><a class="anchor" id="autotoc_md117"></a>
+<h2><a class="anchor" id="autotoc_md131"></a>
 Testing Post Process</h2>
 <p>To test the post-processing code, append the <code>-a</code> or <code>--test-all</code> option: </p><div class="fragment"><div class="line">./mfc.sh test -a -j 8</div>
 </div><!-- fragment --><p>This argument will re-run the test stack with <code>parallel_io='T'</code>, which generates silo_hdf5 files. It will also turn most write parameters (<code>*_wrt</code>) on. Then, it searches through the silo files using <code>h5dump</code> to ensure that there are no <code>NaN</code>s or <code>Infinity</code>s. Although adding this option does not guarantee that accurate <code>.silo</code> files are generated, it does ensure that the post-process code does not fail or produce malformed data. </p>
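<p>The non-finite-value check described above can be illustrated with a minimal sketch. It assumes the <code>h5dump</code> output is already available as text; the helper name <code>find_nonfinite_lines</code> and the token pattern are hypothetical and are not taken from MFC's test toolchain, which may perform this check differently.</p>

```python
import re

# Hypothetical helper, not part of the MFC toolchain: it scans text (for
# example, captured h5dump output) for tokens that denote non-finite values.
# Assumption: such values are printed as nan, inf, or infinity in any letter
# case, optionally signed, and never embedded in a longer identifier.
_NONFINITE = re.compile(
    r"(?<![A-Za-z0-9_])[-+]?(nan|infinity|inf)(?![A-Za-z0-9_])",
    re.IGNORECASE,
)


def find_nonfinite_lines(dump_text):
    """Return (line_number, stripped_line) pairs that mention NaN or Infinity."""
    hits = []
    for number, line in enumerate(dump_text.splitlines(), start=1):
        if _NONFINITE.search(line):
            hits.append((number, line.strip()))
    return hits
```

<p>An empty result means no non-finite token was found. Note that a dataset whose name is exactly <code>nan</code> or <code>inf</code> would be reported as a false positive by this simple pattern.</p>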