Aspera Shares Connect Information (last updated Oct 12, 2022):

****************
Obtain credentials for your Aspera Shares Site

Step 1. Obtain the Aspera Shares Site Credentials (usually provided by your project manager via email)
    USERNAME: (Example: SN0020420)
    PASSWORD: (Example: Y5pkItiMlDay)

Step 2. If your share uses at-rest encryption, obtain a file decryption passphrase. This is now unusual.
    A. In a web browser, access your FILEPASSURL. (This URL should be provided by a BITS team member.)
       (Example FILEPASSURL: https://www.broadinstitute.org/aspera/8759f97c-b618-4fa5-b44e-4c3a253b0bc4.php)
    B. ***Warning: Each link is valid for one-time use only***
       If the URL returns "Page not found", the URL is no longer valid. Please request a new FILEPASSURL from the project manager.
    C. Be sure to securely record the passphrase returned from the URL request. If you lose it, you will need to request another FILEPASSURL.

***************
Choose your method of download

Transfer can happen via these methods:
    1) Web browser
    2) 64-bit Linux command line (recommended for sites with >100 files)

Method 1 Web download instructions
- Go to the Broad Aspera Shares Site (https://shares.broadinstitute.org/)
- Log in with the Aspera Shares Credentials recorded in step #1
- Navigate to your directory of interest
- Select files and click the download button
  If you see a pop-up seeking permission to connect to an Aspera Server, please click "Allow"
- If your share is encrypted, a pop-up window will ask for the decryption passphrase (recorded in step #2).
  If you do not see the pop-up promptly, check for it behind your browser window.
  Enter your decryption passphrase if you wish to decrypt the files.
- Click "OK" and the download process will commence.
Method 2 Linux command line instructions
- Install the Aspera CLI client for Linux
  On the machine you will use for the transfer, download the Aspera CLI client from
  https://data.broadinstitute.org/aspera_doc/ibm-aspera-cli-3.9.6.1467.159c5b1-linux-64-release.sh
  This will download a shell script that, when executed, will install a command line utility for interacting with several Aspera products, including Shares.
  Once unpacked and installed, you will need the aspera and ascp command line executables added to your PATH. For example:
    export PATH=~/.aspera/cli/bin:$PATH
  You may also want to add the man pages so you can use "man aspera" to see how to use the client:
    export MANPATH=~/.aspera/cli/share/man:$MANPATH

***************
Linux command line examples

Example 1 Downloading an entire share
    aspera shares download --username=SN0XXXXXX --password=yyyyyyyyyyyyy \
        --host=shares.broadinstitute.org --destination=/download/destination \
        --source=SN0XXXXXX/
  - If you are using an encrypted site:
    aspera shares download --username=SN0XXXXXX --password=yyyyyyyyyyyyy \
        --host=shares.broadinstitute.org --destination=/download/destination \
        --source=SN0XXXXXX/ --content-protect-password=zzzzzzzzzzzzzzz

Example 2 Manifest Download
    aspera shares download --username=SN0XXXXXX --password=yyyyyyyyyyyyy \
        --host=shares.broadinstitute.org --destination=/download/destination \
        --source=SN0XXXXXX/Manifest.txt

Example 3 Using Manifest to download all files
    for file in $(awk '{print $1}' Manifest.txt); do
        aspera shares download --username=SN0XXXXXX --password=yyyyyyyyyyyyy \
            --host=shares.broadinstitute.org \
            --destination="/download/destination/$file" \
            --source="SN0XXXXXX/$file"
    done

***************
Frequently asked questions

- I accidentally clicked on the passphrase URL and now I can't download the data. What do I do now? Can I get another passphrase?
  Yes. Contact your project manager (or whoever sent you the URL in the first place). They can request another one.
- Why are all my files named ending in .aspera-env and why can't I read them?
  See the next question.

- How do I know if the share I'm accessing is encrypted and why should I care?
  Some sites are encrypted. If the files listed on the site have names that end in ".aspera-env", they are encrypted. We do this because the files may include sensitive data and we want to provide additional privacy in the unlikely case that they are accessed by an unauthorized person.
  This means you will need an encryption/decryption passphrase (in addition to your password) to access the data they contain. This passphrase is retrieved from a special web site that can be accessed only once, so please make sure you record the passphrase when you retrieve it.
  When you download, you should choose to decrypt the data during the download process and provide the passphrase.

- It's taking a very long time for me to download my genome data. What's a typical download speed and how long should it take to download a BAM file?
  This will depend strongly on the speed of your network connection and on any congestion on the network between our endpoints. For a single-stream download we typically see a maximum of 10 MB/second. So a 20 GB exome BAM would take about 2,000 seconds, or a bit over half an hour. A 200 GB whole genome BAM would take about 20,000 seconds, or roughly 5.5 hours.

- I have a lot of files, can I download them in parallel?
  Yes, but it will require some work. Each site will have a Manifest.txt file that lists all the files available on the site and their checksums as calculated by md5sum. If you download that file, you can split it into several pieces and then run those pieces in parallel, on the same or different hosts, in the same manner as the example above of "Using Manifest to download all files".

- What is the best way to verify the integrity of my file download?
  The Manifest.txt file lists all the files downloadable from the site and gives the md5sum checksum of each one.
  Most Unix/Linux installations will have the md5sum command which, when run on the decrypted file, should return the same checksum.

- Approximately what is the size of my data?
  It's difficult to predict, since a number of different data types are delivered and the size of a given data product may change somewhat as technologies improve. If you browse the share with a web client, the size of each file will be displayed, so calculating the approximate size should be straightforward.

- Should I decrypt my files during download or after download?
  In general, the answer is "decrypt files during download." There are probably some situations where an advanced user could improve total throughput by downloading without decryption and decrypting later. But unless you know you have enough space to store two copies of the data and you know how to use the asunprotect command, you shouldn't try it.

- Which web browser works best with the Aspera web interface?
  We have had some reports that the Chrome browser sometimes has issues with the Aspera Connect client. Our best advice is that if you're having issues getting the Aspera Connect client to work right, try a different browser.

- I'm getting timeouts every time I try to transfer a file
  See the section below on Firewall Settings.

***************
Firewall Settings

Because the fasp protocol uses a connectionless method of data transfer, it's possible that a corporate firewall will need to be adjusted to permit the traffic to flow. The symptom will be either an error message about "timeout creating ssh session" or timeouts transferring the data itself. To enable the fasp protocol to work, the firewall needs to permit outgoing TCP connections to port 33001 and to permit UDP traffic on ports 33001-33010.
Summary:

Remote IPs          Ports            Protocol    Direction
----------          -----            --------    ---------
69.173.125.33       443              TCP         Outgoing
130.211.148.168     33001            TCP         Outgoing
130.211.182.54      33001            TCP         Outgoing
130.211.186.135     33001            TCP         Outgoing
146.148.76.138      33001            TCP         Outgoing
23.236.57.4         33001            TCP         Outgoing
130.211.143.166     33001            TCP         Outgoing
130.211.174.182     33001            TCP         Outgoing
130.211.148.168     33001 - 33010    UDP         in/out
130.211.182.54      33001 - 33010    UDP         in/out
130.211.186.135     33001 - 33010    UDP         in/out
146.148.76.138      33001 - 33010    UDP         in/out
23.236.57.4         33001 - 33010    UDP         in/out
130.211.143.166     33001 - 33010    UDP         in/out
130.211.174.182     33001 - 33010    UDP         in/out
69.173.124.97       33001            TCP         Outgoing
69.173.124.98       33001            TCP         Outgoing
69.173.124.97       33001 - 33010    UDP         in/out
69.173.124.98       33001 - 33010    UDP         in/out

***************
Important Notes

Some shares do not require encryption. In this case, you won't be notified about a FILEPASSURL. Simply omit the --content-protect-password option (and any ASPERA_SCP_FILEPASS setting) and leave ".aspera-env" out of the file names in the examples.

Some shares have subdirectories. The example of using Manifest doesn't account for that and would need to be extended to handle it.
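
The integrity check described in the FAQ (comparing each downloaded file against the md5sum checksums in Manifest.txt) could be sketched roughly as follows. This is only an illustration, not an official tool: the function name verify_manifest is hypothetical, and it ASSUMES that each line of Manifest.txt holds the file path in the first column and the md5 checksum in the second, separated by whitespace, with no spaces in the path. Please look at your own Manifest.txt and adjust the column order if it differs.

```shell
#!/usr/bin/env bash
# Sketch: check downloaded (decrypted) files against Manifest.txt.
# ASSUMPTION: each manifest line is "<path> <md5 checksum>", separated by
# whitespace, with no spaces in the path. Verify this against your manifest.

verify_manifest() {
    local manifest=$1      # path to the downloaded Manifest.txt
    local download_dir=$2  # directory the files were downloaded into

    # md5sum -c expects "<checksum>  <path>", so swap the two columns,
    # then check every listed file relative to the download directory.
    # --quiet suppresses the "OK" lines; only failures are printed.
    awk 'NF >= 2 { print $2 "  " $1 }' "$manifest" |
        (cd "$download_dir" && md5sum -c --quiet -)
}
```

For example, "verify_manifest Manifest.txt /download/destination" returns a zero exit status only if every listed file is present and matches its checksum. Because the paths are checked relative to the download directory, files in subdirectories are covered as long as the manifest lists them with their relative paths; point the second argument at wherever the files actually landed.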