Accessing Swestore with the ARC client
Revision as of 14:51, 1 June 2017
This guide describes how to use the Nordugrid ARC client for storing and retrieving files from Swestore. The ARC client is usually used for sending grid jobs to grid clusters, but it also contains commands for data management. A complete user guide for the ARC client can be found in http://www.nordugrid.org/documents/arc-ui.pdf.
Requirements
To access Swestore using the ARC client you need to have an eScience client certificate and a storage project.
You also need to have the certificate installed on the resource where you want to run the ARC commands. For SNIC resources this process includes exporting the certificate from your browser, transferring it to the intended SNIC resource and preparing it for use with grid tools.
All SNIC HPC systems should have the ARC client installed. If yours doesn't, please contact support at your centre so they can fix this as soon as possible. To install the ARC client on your own computer, please follow instructions here, or see the official Nordugrid ARC installation page for more information.
Quickstart
Basic commands
- arcproxy - unlock your certificate so you can use it. See Proxy certificates for details.
- arcls - for listing files. Works similarly to ls. Example: arcls gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME
- arcmkdir - for creating directories. Works similarly to mkdir. Example: arcmkdir gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir
- arccp - for copying files. Works similarly to cp. Example: arccp myfile.txt gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/myfile.txt
- arcrm - for deleting files. Works similarly to rm. Example: arcrm gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/whoops.txt
Use man and --help to get more info on each command. Examples: man arcrm or arcls --help
Paths
The ARC commands support multiple storage protocols. We recommend using GridFTP with paths of the form gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/...
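The recommended path form can be wrapped in a small shell function so the host and prefix are typed only once. This is a minimal sketch; the function name swestore_url is hypothetical and not part of the ARC client.

```shell
# Hypothetical helper (not part of ARC): build a Swestore GridFTP URL
# from a project name and an optional path inside the project.
swestore_url() {
    # $1 = project name, $2 = path relative to the project root (may be empty)
    printf 'gsiftp://gsiftp.swestore.se/snic/%s/%s\n' "$1" "${2:-}"
}
```

It could then be used as, for example, arcls "$(swestore_url YOUR_PROJECT_NAME)".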
Unlock your certificate
Your certificate needs to be unlocked before you can do anything. Think of the process as logging in. When successful, a proxy certificate is the result.
$ arcproxy
To see the lifetime of your session, use:
$ arcproxy -I
Copying files
Copying files to and from resources is accomplished using the arccp command.
Copying single files
Copying single files is accomplished in the same way as using the normal cp command as shown in the following example:
$ arccp archive.tar.gz gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/
Please note the trailing / which marks the destination as a directory. Without a / the destination will be a file, which may or may not be what you wanted. All required directories are created when needed so the destination may be a nonexisting directory.
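The trailing-slash rule can be illustrated with a small shell function that predicts the name the copied file ends up with. This is only a sketch of the rule as described above; effective_dest is a hypothetical name and the function does not call arccp.

```shell
# Illustration only: predict the destination of an arccp upload according
# to the trailing-slash rule. Nothing is copied.
effective_dest() {
    # $1 = local source file, $2 = destination URL
    case "$2" in
        */) printf '%s%s\n' "$2" "$(basename "$1")" ;;  # directory: source file name is kept
        *)  printf '%s\n' "$2" ;;                        # no slash: destination is the file itself
    esac
}
```

With the destination gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/ and the source archive.tar.gz, the function prints the URL ending in YOUR_PROJECT_NAME/archive.tar.gz; without the trailing slash it prints the destination unchanged.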
Recursive copying
Recursive copying is accomplished using the --recursive option to arccp. The argument to the option determines the depth of the recursive copy. Supply a large number such as 999 if you want the entire source directory tree.
Example:
$ arccp --recursive=999 foobar/ gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/
NOTE: The above example will copy all files in the directory foobar into the destination directory YOUR_PROJECT_NAME. If you want the directory foobar to be part of the destination path you have to explicitly supply it as shown in the example below:
$ arccp --recursive=999 foobar/ gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/foobar/
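Since it is easy to forget to repeat the directory name in the destination, the command line can be generated instead of typed. The sketch below only prints the command; print_tree_upload is a hypothetical name, and the depth 999 is the value suggested above.

```shell
# Hypothetical helper: print (not run) the arccp command that uploads a
# local directory so that its name is kept in the destination path.
print_tree_upload() (
    # $1 = local directory, $2 = project name
    src="${1%/}"                    # strip one trailing slash, if present
    name="$(basename "$src")"       # directory name to repeat on the remote side
    printf 'arccp --recursive=999 %s/ gsiftp://gsiftp.swestore.se/snic/%s/%s/\n' \
        "$src" "$2" "$name"
)
```

Review the printed line before running it; print_tree_upload foobar/ YOUR_PROJECT_NAME reproduces the second example above.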
Long-running operations
Note that copying large directory trees can take quite some time, and might fail if you're not aware of the following:
- Your login session created with the arcproxy command has a limited lifetime. Use arcproxy -I to show the remaining time. Use arcproxy -c validityPeriod=xxH to initiate a session with a longer lifetime.
- The command will abort if you lose your network connection with the computer where you are running arccp. A utility such as screen or tmux can be used to create a terminal session you can reattach to.
- Transfer rates are largely dependent on the average file size. If you have a lot of small files the transfer will be slower than if you have large files.
- We recommend limiting your transfer sessions (i.e. the directory tree copied with each arccp command) to 1TB if you have mostly large (100+MB) files and to 100GB if you have smaller files.
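The size recommendation can be turned into a rough pre-flight check. The sketch below is an assumption-laden illustration: transfer_size_ok is a hypothetical name, "mostly large files" is approximated by an average file size of at least 100 MB, and decimal units are used for MB, GB and TB.

```shell
# Hypothetical pre-flight check based on the size recommendations above.
# Assumption: "mostly large files" means an average size of at least 100 MB.
transfer_size_ok() (
    # $1 = total bytes in the directory tree, $2 = number of files
    total=$1
    files=$2
    [ "$files" -gt 0 ] || exit 1       # nothing to judge without files
    avg=$((total / files))
    if [ "$avg" -ge 100000000 ]; then
        limit=1000000000000            # 1 TB for mostly large files
    else
        limit=100000000000             # 100 GB for smaller files
    fi
    [ "$total" -le "$limit" ]          # exit status 0 means within the limit
)
```

On a system with GNU du, the inputs could be gathered with du -sb foobar | cut -f1 and find foobar -type f | wc -l. A non-zero exit status suggests splitting the tree into several arccp commands.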
Listing files
Listing files on a resource is done using the arcls command. In its simplest form the command just takes a URL as input and displays the names of files and directories without any extra information, as shown in the following example:
$ arcls gsiftp://gsiftp.swestore.se/snic/bils/db/uniprot/2012_05
reldate.txt
speclist.txt
uniprot_sprot.dat.gz
uniprot_sprot.fasta.gz
uniprot_trembl.dat.gz
uniprot_trembl.fasta.gz
Additional information can be listed by adding the --long option:
$ arcls --long gsiftp://gsiftp.swestore.se/snic/bils/db/uniprot/2012_05
<Name>                  <Type> <Size>     <Creation>          <Validity> <CheckSum>       <Latency>
reldate.txt             file   151        2012-05-23 03:00:19 (n/a)      adler32:f3f52f1d (n/a)
speclist.txt            file   1715169    2012-05-23 03:00:17 (n/a)      adler32:91e59dae (n/a)
uniprot_sprot.dat.gz    file   462895141  2012-05-23 02:57:18 (n/a)      adler32:0f131bb2 (n/a)
uniprot_sprot.fasta.gz  file   79935897   2012-05-23 03:00:20 (n/a)      adler32:89844c57 (n/a)
uniprot_trembl.dat.gz   file   9162678278 2012-05-23 02:57:34 (n/a)      adler32:b2d7cfd5 (n/a)
uniprot_trembl.fasta.gz file   4456514443 2012-05-23 02:57:34 (n/a)      adler32:2b73b2a1 (n/a)
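The long listing is easy to post-process with standard tools. As a sketch, the function below totals the size column; sum_listing_bytes is a hypothetical name, and it assumes the column order shown above (name, type, size, ...) and file names without spaces.

```shell
# Sketch: total the <Size> column of `arcls --long` output read from stdin.
# Assumes whitespace-separated columns in the order name, type, size, ...
sum_listing_bytes() {
    # Only rows whose type is "file" are counted; the header row is skipped
    # because its second field is "<Type>".
    awk '$2 == "file" { total += $3 } END { printf "%.0f\n", total }'
}
```

A possible use is arcls --long URL | sum_listing_bytes, which prints the total number of bytes in that directory (subdirectories are not followed).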
Metadata
Metadata for a specific file can be listed by specifying the -m or --metadata option. Note that the amount of metadata available differs depending on which protocol is used.
Examples:
$ arcls --metadata gsiftp://gsiftp.swestore.se/ops/nikke/smallfile
/ops/nikke/smallfile
checksum:adler32:762606eb
mtime:2013-04-12 11:06:56
path:/ops/nikke/smallfile
size:30
type:file
$ arcls --metadata srm://srm.swegrid.se/ops/nikke/smallfile
/ops/nikke/smallfile
accessperm:rw-r-----
checksum:adler32:762606eb
ctime:2013-04-12 11:06:56
filestoragetype:PERMANENT
group:25001
latency:ONLINE
lifetimeassigned:PT1S
lifetimeleft:PT1S
mtime:2013-04-12 11:06:56
owner:25001
path:/ops/nikke/smallfile
size:30
spacetokens:
type:file
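Because the metadata is printed as key:value lines, a single field can be picked out in a script. The following is a minimal sketch; metadata_field is a hypothetical name, and it assumes one key:value pair per line as in the examples above.

```shell
# Sketch: print the value of one metadata field from `arcls --metadata`
# output read from stdin. Assumes one "key:value" pair per line.
metadata_field() {
    # $1 = field name, e.g. checksum, size or mtime
    awk -v key="$1" '
        { sub(/^[ \t]+/, "") }                                          # ignore any indentation
        index($0, key ":") == 1 { print substr($0, length(key) + 2) }   # text after "key:"
    '
}
```

For the smallfile example, piping the output through metadata_field checksum prints adler32:762606eb.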
Creating directories
Directories are generally created on demand. If you copy a file with the destination /snic/YOUR_PROJECT_NAME/newdir/dummyfile the newdir directory will be created if missing. But you can explicitly create directories using the arcmkdir command.
$ arcmkdir gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir
Removing files or directories
$ arcrm gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir/dummyfile
$ arcrm gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir/
Directories must be empty before they can be removed.
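Since a directory has to be emptied first, the removal order can be sketched as a dry run that prints the arcrm commands without executing them. print_dir_removal is a hypothetical name, and the sketch assumes the directory contains only files (no subdirectories) whose names are supplied one per line, for example from arcls.

```shell
# Hypothetical dry run: print the arcrm commands needed to remove a
# directory that contains only files. Nothing is deleted.
print_dir_removal() {
    # $1 = directory URL ending in /; file names are read from stdin
    while IFS= read -r name; do
        if [ -n "$name" ]; then
            printf 'arcrm %s%s\n' "$1" "$name"   # remove each file first
        fi
    done
    printf 'arcrm %s\n' "$1"                      # then the now-empty directory
}
```

Inspect the printed commands carefully before running any of them, since removals cannot be undone with the commands described in this guide.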
FAQ
- I get this message when I try to list files:
$ arcls gsiftp://gsiftp.swestore.se/snic/
ERROR: Unsupported URL given
The nordugrid-arc-plugins-globus package is missing. Without it ARC is not able to use the gsiftp protocol.
- arcproxy gives WARNING or ERROR messages.
  - The most common reason is a missing certificate file. See the Requirements section.