Training a TFNO with Navier-Stokes, with FLOPs count #583
Conversation
Hi Natalie,
Thank you for reaching out, glad to hear you’ve been using our TFNO!
Yes, we certainly welcome all contributions, and the ones you mention sound
very valuable! Optimizing the hyperparameters of the TFNO to speed up the
forward and backward passes, along with the other optimizations, is particularly
interesting to me, especially since we put a lot of effort into providing a
simple API for the factorized forward passes.
How do you optimize the CPU–GPU communication? I assume this is for the
case where the data is at a very high resolution and has to be streamed from
disk?
Happy to set up a chat to discuss the details.
Best,
Jean
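For reference, a minimal sketch of instantiating a factorized TFNO through the neuralop API referred to above. The constructor arguments used here (n_modes, hidden_channels, in_channels, out_channels, factorization, rank) are assumed from recent neuralop releases and may differ between versions, so treat this as illustrative rather than canonical.

```python
# Minimal sketch of a factorized TFNO; argument names are assumptions
# based on recent neuralop releases and may differ in your version.
import torch
from neuralop.models import TFNO

model = TFNO(
    n_modes=(16, 16),          # Fourier modes kept per spatial dimension
    hidden_channels=32,        # width of the lifted representation
    in_channels=1,
    out_channels=1,
    factorization="tucker",    # factorized spectral weights
    rank=0.05,                 # fraction of the full rank to keep
)

x = torch.randn(4, 1, 64, 64)  # (batch, channels, height, width)
y = model(x)
print(y.shape)                 # expected: torch.Size([4, 1, 64, 64])
```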
…On Sat, Apr 19, 2025 at 12:21 AM Natalie Pham ***@***.***> wrote:
I've been working with TFNO models and recently developed a script that
demonstrates model performance along with FLOPs analysis for both forward
and backward passes.
I'd like to contribute to the NeuralOperator project by developing a
training example and accompanying documentation that:
- Demonstrates TFNO performance on the 2D Navier-Stokes equations
- Includes FLOPs profiling for model introspection and optimization
- Trains TFNO on multiple GPUs, with ongoing work to optimize
communication loops between GPUs
- Discusses strategies for efficient CPU–GPU communication during
training
Please let me know if this would be a valuable addition to the project —
I've opened a PR and would greatly appreciate any feedback as I iterate.
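A minimal sketch of the kind of forward- and backward-pass FLOPs accounting described above, using torch.profiler's with_flops option. The toy model here is a stand-in for the TFNO, and note that FLOPs are only reported for operators with registered formulas (mainly matmul and convolution), so FFT-based spectral layers may be under-counted.

```python
# Sketch of FLOPs accounting for one forward + backward pass with
# torch.profiler (with_flops=True). The tiny model is a placeholder
# for the TFNO under test.
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Sequential(
    torch.nn.Linear(64, 64), torch.nn.GELU(), torch.nn.Linear(64, 64)
)
x = torch.randn(16, 64)
target = torch.randn(16, 64)
loss_fn = torch.nn.MSELoss()

with profile(activities=[ProfilerActivity.CPU], with_flops=True) as prof:
    loss = loss_fn(model(x), target)   # forward pass
    loss.backward()                    # backward pass

# Sum FLOP estimates over all profiled ops that report a count.
total_flops = sum(evt.flops for evt in prof.key_averages() if evt.flops)
print(f"estimated FLOPs (fwd + bwd): {total_flops:,}")
```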
------------------------------
You can view, comment on, or merge this pull request online at:
#583
Commit Summary
- f3984ec: Training an TFNO with navier-stokes, with flops count
File Changes (1 file <https://github.com/neuraloperator/neuraloperator/pull/583/files>)
- A examples/training/train_TFNO_NavierStoke_flops_count.py (137)
Patch Links:
- https://github.com/neuraloperator/neuraloperator/pull/583.patch
- https://github.com/neuraloperator/neuraloperator/pull/583.diff
Hi Jean,
Thank you for your quick reply. I've used torch.profiler to record memory usage and kernel activity on a per-epoch basis. For CPU–GPU transfers, I've enabled pin_memory=True and non_blocking=True and set up asynchronous data loading to handle larger batch volumes.
When working with very high-resolution data, I'm exploring a distributed streaming approach, but I haven't yet found any existing functionality for that in the NeuralOperator codebase. If I've overlooked something, could you point me to the relevant module or function? Otherwise, any guidance on where to start implementing distributed data streaming would be greatly appreciated.
Thanks again for your help!
Best,
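A minimal sketch of the transfer pattern described above: pinned host memory, non-blocking copies, and worker-based asynchronous loading. The dataset, shapes, and batch size are placeholders, not the Navier-Stokes data used in the PR.

```python
# Pinned-memory DataLoader plus non_blocking host-to-device copies,
# so data movement can overlap with prior GPU work.
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1024, 1, 64, 64),
                        torch.randn(1024, 1, 64, 64))
loader = DataLoader(
    dataset,
    batch_size=32,
    shuffle=True,
    num_workers=4,      # asynchronous loading in worker processes
    pin_memory=True,    # page-locked host buffers enable async copies
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
for xb, yb in loader:
    # non_blocking=True only overlaps the copy when the source tensor
    # is pinned and the target device is CUDA.
    xb = xb.to(device, non_blocking=True)
    yb = yb.to(device, non_blocking=True)
    # ... forward/backward step would go here ...
    break
```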
Meanwhile, I’d be grateful for any feedback or suggestions you have on my TFNO example using the Navier–Stokes dataset.
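On the multi-GPU training and inter-GPU communication loops mentioned in the PR description, a minimal DistributedDataParallel sketch is shown below. bucket_cap_mb and gradient_as_bucket_view are standard PyTorch knobs for tuning gradient all-reduce traffic; the TFNO constructor arguments are the same assumptions as in the earlier sketch.

```python
# Multi-GPU training sketch, launched with:
#   torchrun --nproc_per_node=<N> train_ddp.py
# The TFNO arguments are assumptions; the training loop is elided.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from neuralop.models import TFNO

def main():
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = TFNO(n_modes=(16, 16), hidden_channels=32,
                 factorization="tucker", rank=0.05).cuda(local_rank)
    model = DDP(
        model,
        device_ids=[local_rank],
        bucket_cap_mb=25,              # size of gradient all-reduce buckets
        gradient_as_bucket_view=True,  # avoid an extra gradient copy
    )
    # ... per-rank DataLoader with DistributedSampler and training loop ...
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```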