Enable transfers of archived data to Niagara from non-HPSS-connected systems #1357

Open
DavidHuber-NOAA opened this issue Mar 1, 2023 · 15 comments · May be fixed by #3240
Labels
feature New feature or request

Comments

@DavidHuber-NOAA
Contributor

Description

As part of the archive (earc, arch) jobs, an option to push data to Niagara from Globus Endpoint machines would allow users to stage data for archive to HPSS. This feature would require that LOCAL_ARCH="YES" be set. This option would be useful for Orion, S4, and eventually Hercules.

Acceptance Criteria (Definition of Done)

Data is pushed to HPSS from Orion and verified against the original .tar files created on Orion.
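One way the verification step could be carried out is by comparing checksums of the tarballs retrieved from HPSS against checksums recorded on Orion. This is only a minimal sketch: the function names and the choice of SHA-256 are assumptions for illustration and are not part of the proposal.

```python
import hashlib
from pathlib import Path


def sha256sum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(expected: dict[str, str], retrieved_dir: Path) -> list[str]:
    """Return names of tarballs that are missing or whose checksum differs.

    `expected` maps tarball names to the digests recorded on the source
    machine (hypothetical bookkeeping, assumed for this sketch).
    """
    mismatches = []
    for name, checksum in sorted(expected.items()):
        candidate = retrieved_dir / name
        if not candidate.is_file() or sha256sum(candidate) != checksum:
            mismatches.append(name)
    return mismatches
```

An empty list from `find_mismatches` would indicate that every tarball pulled back from HPSS matches the original.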

(Optional): Suggest A Solution

The Globus transfers could be made part of the *arc j-jobs and run after each successful creation of a tar file.

At the end of each *arc script, an HPSS script containing a list of all .tar files that were sent would be generated and transferred to Niagara via Globus. A crontab entry could then be placed on Niagara to look for and execute these scripts and report any errors via email.
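The script-generation step might look roughly like the sketch below. It is an illustration under stated assumptions, not the actual implementation: the function name, the directory arguments, and the use of `hsi put local : remote` as the push command are all hypothetical choices made for the example.

```python
import shlex
from typing import Iterable


def build_hpss_script(tar_files: Iterable[str], staging_dir: str, hpss_dir: str) -> str:
    """Build a shell script that pushes each staged tarball to HPSS.

    staging_dir: where the tarballs land on the HPSS-connected machine (assumed).
    hpss_dir:    destination directory on HPSS (assumed).
    """
    lines = ["#!/usr/bin/env bash", "set -eu", ""]
    for name in tar_files:
        # Quote both paths so unusual characters cannot break the command
        local = shlex.quote(f"{staging_dir}/{name}")
        remote = shlex.quote(f"{hpss_dir}/{name}")
        # Assumption: one `hsi put` per tarball; the real tool may differ
        lines.append(f"hsi put {local} : {remote}")
    return "\n".join(lines) + "\n"
```

The returned text would be written to a file and sent along with the tarballs, so that the cron job on the receiving side only has to find and execute it.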

@erinjones2 developed scripts that worked on Orion for the feature/ops-orion branch. I plan on adapting these for this purpose.

@DavidHuber-NOAA DavidHuber-NOAA added the feature New feature or request label Mar 1, 2023
DavidHuber-NOAA added a commit to DavidHuber-NOAA/global-workflow that referenced this issue Mar 23, 2023
DavidHuber-NOAA added a commit to DavidHuber-NOAA/global-workflow that referenced this issue Mar 23, 2023
DavidHuber-NOAA added a commit to DavidHuber-NOAA/global-workflow that referenced this issue Mar 23, 2023
@DavidHuber-NOAA
Contributor Author

DavidHuber-NOAA commented May 5, 2023

The coding work for this issue is complete. I am waiting for the rstprod group to be added to Niagara before updating the branch, completing testing, and submitting a PR.

@KateFriedman-NOAA
Member

@arunchawla-NOAA Can we submit a request to get rstprod on Niagara? It would help with the lack of direct HPSS access from Orion (and Hercules). Thanks!

@DavidHuber-NOAA DavidHuber-NOAA self-assigned this Oct 10, 2023
@DavidHuber-NOAA
Contributor Author

This will remain blocked until rstprod is added to Niagara.

@DavidHuber-NOAA DavidHuber-NOAA removed their assignment Mar 22, 2024
@CatherineThomas-NOAA
Contributor

Curious about the status of this as we evaluate whether we can use Orion/Hercules for future GFSv17 prototypes/retros. Was there ever a request made to get rstprod on Niagara?

@CatherineThomas-NOAA
Contributor

Tagging @JacobCarley-NOAA on this as acting EIB chief.

@KateFriedman-NOAA Do you know if Arun ever made the request for rstprod on Niagara?

@KateFriedman-NOAA
Member

> Do you know if Arun ever made the request for rstprod on Niagara?

To my knowledge he never submitted the request. @JacobCarley-NOAA is this something you could submit?

@CatherineThomas-NOAA
Contributor

@KateFriedman-NOAA @DavidHuber-NOAA @JessicaMeixner-NOAA @JacobCarley-NOAA

Good news, everyone: rstprod has now been added to Niagara.

@KateFriedman-NOAA
Member

Hooray! Thanks for the update @CatherineThomas-NOAA !

@DavidHuber-NOAA DavidHuber-NOAA self-assigned this Jun 17, 2024
@DavidHuber-NOAA
Contributor Author

I am working with George Fekete to establish a workflow for these transfers (RDHPCS ticket number 2024061454000228).

@DavidHuber-NOAA
Contributor Author

I met with George Fekete today, who demonstrated a generic tool for pushing to and pulling from HPSS via Niagara (or any HPSS-connected machine). I will test the tool and then start integrating it into the workflow.

@CatherineThomas-NOAA
Contributor

Hi @DavidHuber-NOAA. I'm curious as to why this has moved from Todo to Blocked. Is there anything we can help with/escalate?

@DavidHuber-NOAA
Contributor Author

DavidHuber-NOAA commented Nov 15, 2024

Hi @CatherineThomas-NOAA, I am waiting on code delivery from George for a generic transfer tool. Once I have that, I should be able to move forward. I will ping him today to see where that stands.

@DavidHuber-NOAA
Contributor Author

DavidHuber-NOAA commented Jan 15, 2025

@KateFriedman-NOAA @WalterKolczynski-NOAA For the gfs_globus, etc., jobs, I would like to store the dictionaries created by the associated gfs_arch, etc., tasks as YAMLs. These dictionaries contain the name of each tarball and whether it holds rstprod data. What COM location makes sense for this? I was thinking of COM_CONF, but I'm open to suggestions.
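As a rough illustration of what such a YAML could hold, the sketch below renders a mapping of tarball name to rstprod flag. The layout and the key names `tarballs`, `name`, and `has_rstprod` are hypothetical and not taken from the workflow. To stay dependency-free it writes the YAML by hand; a YAML library would be the more usual choice.

```python
import json
from typing import Mapping


def render_tarball_yaml(tarballs: Mapping[str, bool]) -> str:
    """Render {tarball name: has_rstprod} as a small YAML document."""
    lines = ["tarballs:"]
    for name, has_rstprod in sorted(tarballs.items()):
        # json.dumps gives a double-quoted string, which is also a valid YAML scalar
        lines.append(f"  - name: {json.dumps(name)}")
        lines.append(f"    has_rstprod: {'true' if has_rstprod else 'false'}")
    return "\n".join(lines) + "\n"
```

The resulting text could be written to a file under the chosen COM directory by the arch task and read back by the corresponding globus task.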

@KateFriedman-NOAA
Member

COM_CONF seems good to me. Will the dictionaries be removed at the end of the job, or at some other point?

@DavidHuber-NOAA
Contributor Author

OK, sounds good.

At the end of the job would be fine, I think. If globus needs to be run again, the arch tasks would need to be run again to generate the YAMLs, but those are lightweight tasks.

@DavidHuber-NOAA DavidHuber-NOAA linked a pull request Jan 17, 2025 that will close this issue
15 tasks