Use Globus to Copy Data from NC State Google Drive to Research Storage

Purpose

Globus may be used to copy large amounts of data from your NC State Google Drive to Research Storage. This includes data stored in your My Drive space as well as Shared Drives (formerly known as Team Drives).

If you have a lot of data stored in Google Drive and are under pressure to reduce your usage to meet quota limitations, Research Storage is the recommended alternative storage location. Globus is the recommended tool to copy research data from Google to Research Storage.

Globus not only copies the data but also verifies that it has been copied to the destination fully and without errors. This is Globus’s default behaviour. If the Globus transfer completes successfully, you can be confident that Globus made a complete copy of your data.

However, there is one extremely important caveat:

WARNING: Globus does not copy data from Google Docs, Sheets, and other Google native files!
Google Docs, Sheets, Slides and other types of files you create and edit directly from Google Drive are often referred to as Google native files.
It’s very important to understand that Globus does not copy any of the data contained within these types of files!

When you use Globus to copy from Google Drive to Research Storage or any other storage location, you’ll see files in the destination with the same names as the Google native files you see in Drive, but the data you’d expect to be in those files has not been copied. Instead, each destination file only contains a link to the actual data stored in Google Drive.

If you think your Google native file data has been copied and proceed to delete the files from Google Drive, the data will be lost.
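One way to gauge how many files are affected before you copy is to check each file's MIME type. The sketch below assumes you already have a file listing with `name` and `mimeType` fields, such as the Google Drive API returns; the sample listing and the function name `find_google_native` are hypothetical. Google native files have MIME types beginning with `application/vnd.google-apps.`; folders and shortcuts share that prefix, so the sketch excludes them.

```python
# Google native items (Docs, Sheets, Slides, ...) share this MIME type prefix.
NATIVE_PREFIX = "application/vnd.google-apps."

# Folders and shortcuts use the same prefix but are not documents.
NOT_DOCUMENTS = {
    NATIVE_PREFIX + "folder",
    NATIVE_PREFIX + "shortcut",
}


def find_google_native(files):
    """Return the names of entries whose content would not be copied."""
    return [
        entry["name"]
        for entry in files
        if entry["mimeType"].startswith(NATIVE_PREFIX)
        and entry["mimeType"] not in NOT_DOCUMENTS
    ]


# Hypothetical listing, for illustration only.
listing = [
    {"name": "results.csv", "mimeType": "text/csv"},
    {"name": "Lab notes", "mimeType": "application/vnd.google-apps.document"},
    {"name": "Budget", "mimeType": "application/vnd.google-apps.spreadsheet"},
    {"name": "raw-data", "mimeType": "application/vnd.google-apps.folder"},
]

print(find_google_native(listing))
```

Any file the sketch reports needs to be exported to another format, or kept in Google Drive, before you delete anything.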

There are tools to export Google native files into other file formats, as well as other workarounds for this problem. These are outside the scope of this page, but you might be able to find additional information on NC State’s IT Service Portal.

This page is intended to concisely list the steps to copy data from Google to Research Storage using Globus. This page does not explain how to deal with Google Drive quotas or how to go about deleting data from Google Drive after you’ve copied it elsewhere.

Instructions

  1. Open this link:
    Globus File Manager: NC State Google Drive Connector & Research Storage

  2. Log in to Globus with your NC State Unity ID.
    See the Login to Globus With Your NC State Account page for more information.

  3. The Globus File Manager will open with the NC State Google Drive Connector Collection shown in the left pane and the NC State Research Storage Collection shown in the right pane.

    If you see a Continue button in either pane, click it to allow Globus to access the information it needs about your account. After clicking Continue, click the link containing your Unity ID, then click the Allow button.

Select Source Data to Copy From Your NC State Google Drive

  1. In the left NC State Google Drive Connector pane, double-click My Drive or Team Drives to browse your NC State Google Drive space and locate the files or directories you want to copy to Research Storage.

Select a Destination Directory in Your Research Storage Space

  1. The right NC State Research Storage pane will look like the image below. Double-click either the rs1 or rsstu folder, then continue navigating to one of the Research Storage locations you already have access to.

NC State Research Storage displayed in Globus

Continue until you locate your destination directory.

You can create the destination directory using Globus if it doesn’t already exist.

Configure Globus Transfer Options

  1. Click Transfer & Timer Options to expand the options

The following options are not required but recommended. Additional information about each option can be found by clicking the info link next to each option (circle with an “i”).

Begin the Copy From Google Drive to Research Storage

  1. In the NC State Research Storage pane on the right, make sure you are in the correct location where you want the data copied
  2. In the NC State Google Drive Connector pane on the left, select the files or directories you want to copy
  3. Click the Start button. You will see a notification confirming that the transfer request was submitted.
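The steps above use the Globus web app. For readers who prefer scripting, the same kind of request can be sketched with the Globus Python SDK (`globus_sdk`). This is a sketch under assumptions, not part of the procedure described on this page: it assumes you already hold an authenticated `globus_sdk.TransferClient`, and the collection IDs, paths, and the helper names `transfer_settings` and `submit_copy` are hypothetical placeholders you would supply yourself.

```python
def transfer_settings(label):
    """Options mirroring the transfer choices discussed on this page."""
    return {
        "label": label,
        # Request checksum verification explicitly rather than relying on a default.
        "verify_checksum": True,
        # Set to True to record and skip problem files instead of failing the task.
        "skip_source_errors": False,
    }


def submit_copy(transfer_client, source_id, dest_id, source_path, dest_path, label):
    """Submit a recursive copy and return the Globus task ID.

    transfer_client is assumed to be an authenticated globus_sdk.TransferClient;
    source_id and dest_id are the collection IDs (placeholders, not given here).
    """
    import globus_sdk  # third-party package: pip install globus-sdk

    tdata = globus_sdk.TransferData(
        source_endpoint=source_id,
        destination_endpoint=dest_id,
        **transfer_settings(label),
    )
    # recursive=True copies a directory and everything beneath it.
    tdata.add_item(source_path, dest_path, recursive=True)
    response = transfer_client.submit_transfer(tdata)
    return response["task_id"]


print(transfer_settings("Google Drive to Research Storage"))
```

The returned task ID identifies the same kind of task that the web app shows on its Activity page.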

What happens next?

When you submit the transfer request, a Globus Task is created and Globus will copy the data that you selected in the background. Globus establishes a direct connection between the NC State Google Drive Connector and NC State Research Storage Collections, and the data is copied using this connection.

Because your computer is not involved in transferring the data, you can navigate away from the page or even shut down your computer.

The amount of time it takes to copy the data depends on how much data you selected. You can view the status and progress of the transfer by clicking on the Activity link.

Additional details are shown if you click on the running task. When the transfer is complete, the task will display a green checkmark.

You will also receive an email notification.
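If you submitted the transfer from a script rather than the web app, you can check the task in a similar way. The sketch below assumes an authenticated `globus_sdk.TransferClient` and polls its `get_task` method until the task reports a final status; the function name `wait_for_task` and the polling interval are illustrative choices.

```python
import time

# Statuses after which a Globus task will not change again.
FINAL_STATES = {"SUCCEEDED", "FAILED"}


def wait_for_task(transfer_client, task_id, poll_seconds=60, sleep=time.sleep):
    """Poll a Globus task until it reaches a final status, then return it.

    A task may also report ACTIVE (running) or INACTIVE (paused, waiting for
    user action); this sketch simply keeps polling in those cases.
    """
    while True:
        status = transfer_client.get_task(task_id)["status"]
        if status in FINAL_STATES:
            return status
        sleep(poll_seconds)
```

A returned status of `SUCCEEDED` corresponds to the green checkmark on the Activity page.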

Frequently Asked Questions

Does the data still exist on my NC State Google Drive after transferring it with Globus?

Yes.

Globus only copies the data. It does not move the data.

If you are using Globus because your NC State Google Drive is over quota, you will still need to delete the data from Google Drive after Globus has copied it to NC State Research Storage.

Does Globus guarantee all the data is copied correctly & completely?

Yes.
(Unless you explicitly tell Globus NOT to verify file integrity after transfer)

If you see a green checkmark and message saying transfer completed on the Activity page, you can be assured that all of the data you selected was copied in its entirety and without errors.

By default, Globus verifies that every file copied to the destination matches the source file exactly. You don’t need to do anything to cause Globus to perform these checks, but you have the option to disable them by expanding Transfer & Timer Options and selecting do NOT verify file integrity after transfer.
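To illustrate what an integrity check of this kind involves, the sketch below compares checksums of a source file and its copy. This is not Globus's own implementation, and this page does not say which checksum algorithm Globus uses; SHA-256 and the function names `sha256_of` and `copies_match` are assumptions chosen for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source_path, dest_path):
    """Return True when both files have identical contents, judged by checksum."""
    return sha256_of(source_path) == sha256_of(dest_path)
```

A check like this can also be run by hand on a few files after a transfer if you want additional reassurance before deleting anything from Google Drive.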

How do I determine the Research Storage path?

The Research Storage Manager page on OIT’s Research Services portal shows you all of the storage allocations you have access to. Each Research Storage allocation has a unique directory path. The Research Storage Manager page shows the path for each allocation in a few different formats. You will want to find the HPC Directory path because it matches the directory organization you see in Globus.


  • Open the Research Storage Manager and log in with your NC State Unity ID and password
  • Locate the HPC Directory row for the Research Storage allocation you want to copy to
  • Go back to the Globus website and navigate to the path by double-clicking each intermediate directory
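The last step above amounts to opening one folder per component of the HPC Directory path. The sketch below splits a path into that sequence; the example path is hypothetical, so substitute the HPC Directory value shown for your own allocation.

```python
from pathlib import PurePosixPath


def folders_to_open(hpc_directory):
    """List each directory to double-click, in order, to reach the path."""
    parts = PurePosixPath(hpc_directory).parts
    # Drop the leading "/" root; every remaining part is one folder to open.
    return [part for part in parts if part != "/"]


# Hypothetical path, for illustration only.
print(folders_to_open("/rs1/researchers/e/example_project"))
```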

Common Problems

ambiguous path - more than one object with that name

Google Drive allows multiple files located in a particular directory to have the same name. Most other file systems including Research Storage do not allow this. Every file in a particular directory must have a unique name.

Globus will generate an ambiguous path - more than one object with that name error when it detects this.

The Globus website offers a brief explanation of this error.
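The condition behind this error can be sketched as a duplicate check over (folder, name) pairs. The listing below is hypothetical and the function name `ambiguous_paths` is illustrative; the point is that the same name is only a problem when it repeats within the same folder.

```python
from collections import Counter


def ambiguous_paths(entries):
    """Return the (folder, name) pairs that occur more than once, sorted."""
    counts = Counter(entries)
    return sorted(pair for pair, count in counts.items() if count > 1)


# Hypothetical listing of (parent folder, file name) pairs.
entries = [
    ("Project A", "notes.txt"),
    ("Project A", "notes.txt"),  # same name in the same folder: ambiguous
    ("Project B", "notes.txt"),  # same name in a different folder: fine
    ("Project A", "data.csv"),
]

print(ambiguous_paths(entries))
```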

How to resolve ambiguous path errors

To resolve this problem, you will need to locate the files in your Google Drive and either rename them to be unique or delete the copies you don’t need.

To determine the names of the files which generated the error:

  • On the left side of the Globus page, click ACTIVITY
  • Open the Task that failed
  • Click the Event Log tab, then View Details

The name of the offending file is displayed on the Command: and Details: lines. Search for this file name in your NC State Google Drive. To display only files with the matching name, enter the following in the Google Drive search box:

title:"<DUPLICATE NAME>"

You should see multiple results. Examine the files and either delete or rename them.
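If the Event Log lists many names, a small helper can generate one search string per distinct name in the format shown above. This is a minimal sketch: the function name `drive_search_queries` and the sample names are hypothetical, and it makes no attempt to handle names that themselves contain double quotes.

```python
def drive_search_queries(names):
    """Build one Google Drive search-box query per distinct file name."""
    return [f'title:"{name}"' for name in sorted(set(names))]


# Hypothetical names collected from an Event Log.
for query in drive_search_queries(["notes.txt", "data.csv", "notes.txt"]):
    print(query)
```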

Once all of the files with duplicate names have been rectified, attempt the Globus transfer again.

How to identify all files with ambiguous names

Select the Skip files on source with errors option when you start the Globus transfer.

With Skip files on source with errors selected, Globus will generate a warning for each ambiguous path it encounters and continue to attempt to copy the remaining files requested in the transfer. This does not solve the problem, and none of the data contained in any of the files with a duplicate name is transferred. However, it allows you to identify all of the problematic files with duplicate names in a single step by viewing the Event Log for the transfer.

Without Skip files on source with errors selected, your entire Globus transfer will fail on the first occurrence. If you resolve the problem by renaming or deleting the file which caused the transfer to fail and attempt to run the transfer again, it will likely encounter another ambiguous path error and fail again. The ambiguous path problem is very common. You may have hundreds of files with duplicate names, and resolving the problem one file at a time is not practical.
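The difference between the two modes can be shown with a toy model. This is not how Globus is implemented; it only mimics the behaviour described above, where one mode stops at the first ambiguous path and the other reports every one. The function name `reported_errors` and the sample paths are hypothetical.

```python
def reported_errors(paths, ambiguous, skip_source_errors):
    """Return the ambiguous paths one simulated transfer attempt would report."""
    reported = []
    for path in paths:
        if path in ambiguous:
            reported.append(path)
            if not skip_source_errors:
                # Without the option, the attempt stops at the first problem.
                break
    return reported


# Hypothetical transfer contents.
paths = ["a.txt", "b.txt", "c.txt", "d.txt"]
ambiguous = {"b.txt", "d.txt"}

print(reported_errors(paths, ambiguous, skip_source_errors=True))
print(reported_errors(paths, ambiguous, skip_source_errors=False))
```

With the option selected, a single attempt surfaces every problem file, so you can fix them all before running the transfer again.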