Globus may be used to copy large amounts of data from your NC State Google Drive to Research Storage. This includes data stored in your My Drive space as well as Shared Drives (formerly known as Team Drives).
If you have a lot of data stored in Google Drive and are under pressure to reduce your usage to meet quota limitations, Research Storage is the recommended alternative storage location. Globus is the recommended tool to copy research data from Google to Research Storage.
Globus not only copies the data but also verifies that it was copied fully and without errors to the destination. This is Globus’s default behavior. If a Globus transfer completes successfully, you can be confident that Globus made a complete copy of your data.
However, there is one extremely important caveat:
This page is intended to concisely list the steps to copy data from Google to Research Storage using Globus. This page does not explain how to deal with Google Drive quotas or how to go about deleting data from Google Drive after you’ve copied it elsewhere.
Open this link:
Globus File Manager: NC State Google Drive Connector & Research Storage
Log in to Globus with your NC State Unity ID.
See the Login to Globus With Your NC State Account page for more information.
The Globus File Manager will open with the NC State Google Drive Connector Collection shown in the left pane and the NC State Research Storage Collection shown in the right pane.
If you see a Continue button in either pane, click it to allow Globus to access information it needs about your account:
After clicking Continue, click the link containing your Unity ID:
Then click the Allow button:
Continue until you locate your destination directory:
You can create the destination directory using Globus if it doesn’t already exist:
The following options are not required but are recommended. Additional information about each option can be found by clicking the info link (a circle with an “i”) next to it.
When you submit the transfer request, a Globus Task is created and Globus will copy the data that you selected in the background. Globus establishes a direct connection between the NC State Google Drive Connector and NC State Research Storage Collections, and the data is copied using this connection.
Because your computer is not involved in transferring the data, you can navigate away from the page or even shut down your computer.
The amount of time it takes to copy the data depends on how much data you selected. You can view the status and progress of the transfer by clicking on the Activity link:
Additional details are shown if you click on the running task. When the transfer is complete, the task will display a green checkmark:
You will also receive an email notification:
Yes.
Globus only copies the data. It does not move the data.
If you are using Globus because your NC State Google Drive is over quota, you will still need to delete the data from Google Drive after Globus has copied it to NC State Research Storage.
Yes, unless you explicitly tell Globus NOT to verify file integrity after transfer.
If you see a green checkmark and message saying transfer completed on the Activity page, you can be assured that all of the data you selected was copied in its entirety and without errors.
By default, Globus verifies that every file copied to the destination matches the source file exactly. You don’t need to do anything to cause Globus to perform these checks, but you have the option to disable them by expanding Transfer & Timer Options and selecting do NOT verify file integrity after transfer:
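The idea behind an integrity check can be illustrated with a short Python sketch. This is not how Globus is implemented; it is a minimal illustration, assuming a SHA-256 checksum and local files, of what it means for a copy to match its source exactly. The function names `file_checksum` and `copy_and_verify` and the file names are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy a file, then confirm the copy has the same checksum as the source."""
    shutil.copyfile(source, destination)
    return file_checksum(source) == file_checksum(destination)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        src = Path(workdir) / "results.csv"        # hypothetical source file
        dst = Path(workdir) / "results_copy.csv"   # hypothetical destination file
        src.write_bytes(b"sample,data\n1,2\n")
        print("copy verified:", copy_and_verify(src, dst))
```

If even a single byte of the copy differs from the source, the two checksums differ and the comparison fails. That is why a completed transfer with verification enabled is a strong signal that the data arrived intact.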
The Research Storage Manager page on OIT’s Research Services portal shows you all of the storage allocations you have access to. Each Research Storage allocation has a unique directory path. The Research Storage Manager page shows the path for each allocation in a few different formats. You will want to find the HPC Directory path because it matches the directory organization you see in Globus.
Google Drive allows multiple files in the same directory to have the same name. Most other file systems, including Research Storage, do not allow this: every file in a particular directory must have a unique name.
Globus will generate an “ambiguous path - more than one object with that name” error when it detects this:
The Globus website offers a brief explanation of this error.
To resolve this problem, you will need to locate the files in your Google Drive and either rename them to be unique or delete the copies you don’t need.
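As an illustration of the renaming approach, the following Python sketch finds repeated names in a directory listing and proposes a unique name for each entry. It works on a plain list of names and does not talk to Google Drive. The helper names and the " (2)" numbering scheme are assumptions made for this example, not something Globus or Google Drive prescribes.

```python
import os
from collections import Counter


def find_duplicate_names(names):
    """Return the names that occur more than once, in first-seen order."""
    counts = Counter(names)
    duplicates = []
    for name in names:
        if counts[name] > 1 and name not in duplicates:
            duplicates.append(name)
    return duplicates


def propose_unique_names(names):
    """Suggest one unique name per entry by numbering repeated names."""
    taken = set()
    proposals = []
    for name in names:
        stem, extension = os.path.splitext(name)
        candidate = name
        counter = 2
        # Keep numbering until the candidate does not clash with a name already used
        while candidate in taken:
            candidate = f"{stem} ({counter}){extension}"
            counter += 1
        taken.add(candidate)
        proposals.append(candidate)
    return proposals


if __name__ == "__main__":
    # Hypothetical listing of a single Google Drive folder
    listing = ["report.pdf", "notes.txt", "report.pdf", "report.pdf"]
    print(find_duplicate_names(listing))   # ['report.pdf']
    print(propose_unique_names(listing))
    # ['report.pdf', 'notes.txt', 'report (2).pdf', 'report (3).pdf']
```

Whether to rename or delete a given copy is still a decision for you to make after examining the files. The sketch only shows that renaming needs to produce names that are unique within the directory.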
To determine the name of the files that generated the error:
The name of the offending file is displayed on the Command: and Details: lines. Search for this file name in your NC State Google Drive. To display only files with the matching name, enter the following in the Google Drive search box:
title:"<DUPLICATE NAME>"
You should see multiple results. Examine the files and either delete or rename them.
Once all of the files with duplicate names have been rectified, attempt the Globus transfer again.
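If you have many file names to look up, the search strings described above can be generated rather than typed by hand. The Python sketch below assumes you have collected the offending paths into a list yourself and that they are slash-separated; both the path format and the helper name `drive_search_queries` are assumptions for illustration. Names that contain a double quote are not handled.

```python
import posixpath


def drive_search_queries(ambiguous_paths):
    """Build one Google Drive search string per distinct file name."""
    names = []
    for path in ambiguous_paths:
        # Take the last path component; strip a trailing slash first
        name = posixpath.basename(path.rstrip("/"))
        if name and name not in names:
            names.append(name)
    return [f'title:"{name}"' for name in names]


if __name__ == "__main__":
    # Hypothetical paths copied from transfer error details
    paths = [
        "/My Drive/project/report.pdf",
        "/My Drive/archive/report.pdf",
        "/My Drive/project/summary.docx",
    ]
    for query in drive_search_queries(paths):
        print(query)
```

Each printed line can be pasted into the Google Drive search box to list every file that shares that name.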
Select the Skip files on source with errors option when you start the Globus transfer.
With Skip files on source with errors selected, Globus will generate a warning for each ambiguous path it encounters and continue to copy the remaining files requested in the transfer. This does not solve the problem: none of the data contained in any of the files with a duplicate name is transferred. However, it allows you to identify all of the problematic files with duplicate names in a single step by viewing the Event Log for the transfer.
Without Skip files on source with errors selected, your entire Globus transfer will fail on the first occurrence of the error. If you resolve the problem by renaming or deleting the file that caused the transfer to fail and run the transfer again, it will likely encounter another ambiguous path error and fail again. The ambiguous path problem is very common. You may have hundreds of files with duplicate names, and resolving the problem one file at a time is not practical.
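The difference between the two modes can be sketched as a toy model in Python. This is not Globus code and does not reproduce its actual behavior in detail. The function `simulate_transfer`, the exception `AmbiguousPathError`, and the file names are all hypothetical, chosen only to contrast stopping at the first error with collecting every error in one pass.

```python
from collections import Counter


class AmbiguousPathError(Exception):
    """Raised in this toy model when a source name is shared by several files."""


def simulate_transfer(source_names, skip_source_errors):
    """Return (copied, skipped) names; stop at the first clash unless skipping."""
    counts = Counter(source_names)
    copied, skipped = [], []
    # dict.fromkeys keeps first-seen order and visits each distinct name once
    for name in dict.fromkeys(source_names):
        if counts[name] > 1:
            if not skip_source_errors:
                raise AmbiguousPathError(name)
            skipped.append(name)   # reported as a warning, not transferred
        else:
            copied.append(name)
    return copied, skipped


if __name__ == "__main__":
    names = ["a.csv", "b.csv", "b.csv", "c.csv", "d.csv", "d.csv"]

    # Skipping enabled: one run reveals every duplicate name
    print(simulate_transfer(names, skip_source_errors=True))
    # (['a.csv', 'c.csv'], ['b.csv', 'd.csv'])

    # Skipping disabled: the run stops at the first duplicate name only
    try:
        simulate_transfer(names, skip_source_errors=False)
    except AmbiguousPathError as error:
        print("transfer failed on:", error)
```

In the model, the run with skipping enabled reports both `b.csv` and `d.csv` at once, while the run without it stops at `b.csv` and would need to be repeated to discover `d.csv`.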