Difference between revisions of "Accessing Swestore with the ARC client"

From SNIC Documentation
(Swestore documentation moved)
(Tag: New redirect)
 
[[Category:Grid computing]]
#REDIRECT[[Swestore Documentation Moved]]
[[Category:SweGrid user guide]]
 
[[Category:Swestore]]
 
[[Category:Swestore user guide]]
 
 
 
[[Getting started with SweGrid|< Getting started with SweGrid]]<br>
 
[[Swestore|< Swestore]]
 
 
 
This guide describes how to use the [http://www.nordugrid.org Nordugrid] ''ARC'' client for storing and retrieving files from SweStore National Storage. The ARC client is usually used for sending grid jobs to grid clusters, but it also contains commands for data management. A complete user guide for the ARC client is available at http://www.nordugrid.org/documents/arc-ui.pdf.
 
 
 
= Requirements =
 
To access SweStore national storage using the ARC client you need to [[Grid_certificates#Requesting_a_certificate|get a grid certificate]] and [[Grid_certificates#Requesting_membership_in_the_SweGrid_VO|become a member]] of the SweGrid virtual organisation. If you want access to your own private storage area you need to have a SweStore [[Apply_for_storage_on_SweStore|storage project]].
 
 
 
All SNIC systems have the ARC client installed. If yours doesn't, please contact support at your centre so they can fix this error as soon as possible. To install the ARC client on your own computer, please follow instructions [[ARC_client_installation|here]], or see the official Nordugrid [http://www.nordugrid.org/documents/arc-client-install.html ARC installation] page for more information.
 
 
 
= Quickstart =
 
 
 
== Basic commands ==
 
: <code>arcproxy</code> - unlock your certificate so you can use it. See [[Grid_certificates#Proxy_certificates|Proxy certificates]] for details.
 
: <code>arcls</code> - for listing files. Works similarly to <code>ls</code>. Example: <code><nowiki>arcls gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME</nowiki></code>
: <code>arcmkdir</code> - for creating directories. Works similarly to <code>mkdir</code>. Example: <code><nowiki>arcmkdir gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir</nowiki></code>
: <code>arccp</code> - for copying files. Works similarly to <code>cp</code>. Example: <code><nowiki>arccp myfile.txt gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/myfile.txt</nowiki></code>
: <code>arcrm</code> - for deleting files. Works similarly to <code>rm</code>. Example: <code><nowiki>arcrm gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/whoops.txt</nowiki></code>
 
 
 
Use <code>man</code> and <code>--help</code> to get more info on each command. Examples: <code>man arcrm</code> or <code>arcls --help</code>
 
 
 
== Paths ==
 
The ARC commands support multiple storage protocols. We recommend using GridFTP with paths of the form <code><nowiki>gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/...</nowiki></code>, but SRM (Storage Resource Manager) paths of the form <code><nowiki>srm://srm.swegrid.se/snic/YOUR_PROJECT_NAME/...</nowiki></code> can also be used.
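As a minimal sketch of the recommended GridFTP path form, the helper below assembles a URL from a project name and a path inside the project. The function name <code>swestore_url</code> and the variable <code>SWESTORE_BASE</code> are hypothetical conveniences and not part of the ARC client.

```shell
#!/bin/sh
# Base of the recommended GridFTP path form described above.
SWESTORE_BASE="gsiftp://gsiftp.swestore.se/snic"

# swestore_url PROJECT [RELATIVE_PATH]
# Hypothetical helper: prints the GridFTP URL for a path inside a project.
swestore_url() {
    project=$1
    relpath=${2:-}
    relpath=${relpath#/}    # drop one leading slash, if any
    printf '%s/%s/%s\n' "$SWESTORE_BASE" "$project" "$relpath"
}

# Example with the placeholder project name used in this guide.
swestore_url YOUR_PROJECT_NAME newdir/myfile.txt
```

The printed URL can be passed to any of the ARC data commands, for example <code>arcls "$(swestore_url YOUR_PROJECT_NAME newdir)"</code>.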
 
 
 
= Copying files =
 
 
 
Copying files to and from resources is accomplished using the '''arccp''' command.
 
 
 
== Copying single files ==
 
 
 
Copying a single file works the same way as with the normal '''cp''' command, as shown in the following example:
 
 
 
$ arccp archive.tar.gz gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/
 
 
 
Please note the trailing / which marks the destination as a directory. Without a / the destination will be a file, which may or may not be what you wanted. All required directories are created when needed, so the destination may be a nonexistent directory.
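Because the trailing / decides whether the destination is treated as a directory or a file, a script can normalise the destination before calling arccp. The sketch below uses a hypothetical helper, <code>as_dir_url</code>, and a made-up subdirectory name <code>backups</code>. It only prints the resulting command and does not run it.

```shell
#!/bin/sh
# as_dir_url URL
# Hypothetical helper: appends a trailing slash unless one is already there,
# so that arccp treats the destination as a directory.
as_dir_url() {
    case $1 in
        */) printf '%s\n' "$1" ;;
        *)  printf '%s/\n' "$1" ;;
    esac
}

# "backups" is an illustrative directory name, not taken from this guide.
dest=$(as_dir_url "gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/backups")

# Print the command for review instead of executing it.
echo "arccp archive.tar.gz $dest"
```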
 
 
 
== Recursive copying ==
 
 
 
Recursive copying is accomplished using the '''--recursive''' option to arccp. The argument to the option determines the depth of the recursive copy; supply a large number such as <code>999</code> if you want the entire source directory tree.
 
 
 
Example:
 
 
 
$ arccp --recursive=999 foobar/ gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/
 
 
 
'''NOTE:''' The above example will copy all files in the directory <code>foobar</code> into the destination directory <code>YOUR_PROJECT_NAME</code>. If you want the directory <code>foobar</code> to be part of the destination path, you have to supply it explicitly, as shown in the example below:
 
 
 
$ arccp --recursive=999 foobar/ gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/foobar/
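The rule in the note can be wrapped in a small function that always appends the source directory name to the destination. This is a sketch only: <code>upload_tree</code> and the <code>DRY_RUN</code> variable are hypothetical names, and the depth of 999 is the value suggested above.

```shell
#!/bin/sh
# upload_tree SOURCE_DIR DEST_URL
# Hypothetical wrapper: copies SOURCE_DIR recursively so that its own name
# becomes part of the destination path.
# If DRY_RUN is set, the arccp command is printed instead of executed.
upload_tree() {
    src=${1%/}                  # drop a trailing slash from the source
    dest=${2%/}                 # ... and from the destination URL
    name=$(basename "$src")     # directory name to preserve remotely
    set -- arccp --recursive=999 "$src/" "$dest/$name/"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Dry run in a subshell, so DRY_RUN does not leak into the calling shell.
( DRY_RUN=1; upload_tree foobar/ gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/ )
```

Run without <code>DRY_RUN</code>, the function executes the same arccp command as the second example above.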
 
 
 
== Long-running operations ==
 
 
 
Note that copying large directory trees can take quite some time, and may fail unless you take the following into account:

* Your login session created with the <code>arcproxy</code> command has a limited lifetime. Use <code>arcproxy -I</code> to show the remaining time. Use <code>arcproxy -c validityPeriod=xxH</code> to initiate a session with a longer lifetime.
* If you lose connectivity with the resource you're running arccp on, the command will abort. A utility such as <code>screen</code> or <code>tmux</code> can be used to create a terminal session you can reattach to.
* Transfer rates depend largely on the average file size: many small files transfer more slowly than the same amount of data in large files.
* We recommend limiting each transfer session (i.e. the directory tree copied with each arccp command) to 1 TB if you have mostly large (100+ MB) files and to 100 GB if you have smaller files.
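The size recommendation in the last bullet can be checked locally before a transfer starts. The sketch below is an approximation under stated assumptions: <code>check_batch</code> is a hypothetical helper, sizes come from <code>du -sk</code> (which includes directory overhead), and "mostly large files" is interpreted as an average file size of at least 100 MB.

```shell
#!/bin/sh
# check_batch DIR
# Hypothetical helper: compares the size of DIR with the recommended limit
# per arccp session (1 TB for mostly large files, 100 GB otherwise).
check_batch() {
    dir=$1
    total_kb=$(du -sk "$dir" | awk '{ print $1 }')
    nfiles=$(find "$dir" -type f | wc -l | tr -d ' ')
    if [ "$nfiles" -eq 0 ]; then
        echo "EMPTY: no files found in $dir"
        return 0
    fi
    avg_kb=$((total_kb / nfiles))
    if [ "$avg_kb" -ge 102400 ]; then         # average file is 100 MB or more
        limit_kb=$((1024 * 1024 * 1024))      # 1 TB expressed in KB
    else
        limit_kb=$((100 * 1024 * 1024))       # 100 GB expressed in KB
    fi
    if [ "$total_kb" -le "$limit_kb" ]; then
        echo "OK: $dir is within the recommended session size"
    else
        echo "SPLIT: $dir exceeds the recommended session size"
    fi
}

# Example: check the current directory.
check_batch .
```

If the result is <code>SPLIT</code>, copying the subdirectories with separate arccp commands keeps each session within the recommendation.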
 
 
 
= Listing files =
 
 
 
Listing files on a resource is done using the '''arcls''' command. In the simplest form the command just takes a URL as input and displays the names of files and directories without any extra information, as shown in the following example:
 
 
 
 $ arcls gsiftp://gsiftp.swestore.se/snic/bils/db/uniprot/2012_05
 reldate.txt
 speclist.txt
 uniprot_sprot.dat.gz
 uniprot_sprot.fasta.gz
 uniprot_trembl.dat.gz
 uniprot_trembl.fasta.gz
 
 
 
Additional information can be listed by adding the '''--long''' option:
 
 
 
 $ arcls --long gsiftp://gsiftp.swestore.se/snic/bils/db/uniprot/2012_05
 <Name>                  <Type> <Size>     <Creation>          <Validity> <CheckSum>       <Latency>
 reldate.txt             file   151        2012-05-23 03:00:19 (n/a)      adler32:f3f52f1d (n/a)
 speclist.txt            file   1715169    2012-05-23 03:00:17 (n/a)      adler32:91e59dae (n/a)
 uniprot_sprot.dat.gz    file   462895141  2012-05-23 02:57:18 (n/a)      adler32:0f131bb2 (n/a)
 uniprot_sprot.fasta.gz  file   79935897   2012-05-23 03:00:20 (n/a)      adler32:89844c57 (n/a)
 uniprot_trembl.dat.gz   file   9162678278 2012-05-23 02:52:01 (n/a)      adler32:b2d7cfd5 (n/a)
 uniprot_trembl.fasta.gz file   4456514443 2012-05-23 02:57:34 (n/a)      adler32:2b73b2a1 (n/a)
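The long listing is convenient for simple scripting, for example to total the size of a directory. The sketch below assumes the column layout shown in the example above (name, type, size, ...) and file names without spaces. <code>sum_sizes</code> is a hypothetical helper, not an ARC command.

```shell
#!/bin/sh
# sum_sizes
# Hypothetical helper: reads "arcls --long" output on standard input and
# prints the total size in bytes of all entries of type "file".
sum_sizes() {
    awk '
        $1 == "<Name>" { next }            # skip the header line
        $2 == "file"   { total += $3 }     # third column is the size
        END            { printf "%.0f\n", total }
    '
}

# Demonstration with two lines copied from the listing above.
sum_sizes <<'EOF'
<Name> <Type> <Size> <Creation> <Validity> <CheckSum> <Latency>
reldate.txt file 151 2012-05-23 03:00:19 (n/a) adler32:f3f52f1d (n/a)
speclist.txt file 1715169 2012-05-23 03:00:17 (n/a) adler32:91e59dae (n/a)
EOF
# prints 1715320
```

Against the real service the helper would be used as <code><nowiki>arcls --long gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME | sum_sizes</nowiki></code>.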
 
 
 
== Metadata ==
 
 
 
Metadata for a specific file can be listed by specifying the '''-m''' or '''--metadata''' option. Note that the amount of metadata available differs depending on which protocol is used.
 
 
 
Examples:
 
 
 
 $ arcls --metadata gsiftp://gsiftp.swestore.se/ops/nikke/smallfile
 /ops/nikke/smallfile
 checksum:adler32:762606eb
 mtime:2013-04-12 11:06:56
 path:/ops/nikke/smallfile
 size:30
 type:file

 $ arcls --metadata srm://srm.swegrid.se/ops/nikke/smallfile
 /ops/nikke/smallfile
 accessperm:rw-r-----
 checksum:adler32:762606eb
 ctime:2013-04-12 11:06:56
 filestoragetype:PERMANENT
 group:25001
 latency:ONLINE
 lifetimeassigned:PT1S
 lifetimeleft:PT1S
 mtime:2013-04-12 11:06:56
 owner:25001
 path:/ops/nikke/smallfile
 size:30
 spacetokens:
 type:file
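Since the metadata is printed as <code>key:value</code> lines, a single field is easy to pick out in a script. The following is a minimal sketch assuming the output format shown in the examples above. <code>meta_field</code> is a hypothetical helper.

```shell
#!/bin/sh
# meta_field KEY
# Hypothetical helper: reads "arcls --metadata" output on standard input
# and prints the value of the line that starts with "KEY:".
meta_field() {
    key=$1
    sed -n "s/^${key}://p"
}

# Demonstration with lines copied from the GridFTP example above.
printf '%s\n' \
    '/ops/nikke/smallfile' \
    'checksum:adler32:762606eb' \
    'mtime:2013-04-12 11:06:56' \
    'size:30' \
    'type:file' | meta_field checksum
# prints adler32:762606eb
```

Only the first colon separates key and value, so values that contain colons themselves, such as the checksum and the timestamps, are returned whole.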
 
 
 
= Creating directories =
 
 
 
$ arcmkdir gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir
 
 
 
If the arcmkdir command is missing, the ARC utilities need to be upgraded. Until then, you can work around this by copying a dummy file to the path you want and then deleting the dummy file.
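The workaround can be sketched as a small function. The names <code>run</code>, <code>mkdir_workaround</code>, <code>DRY_RUN</code> and the remote file name <code>.placeholder</code> are hypothetical. The sketch relies on the behaviour described in this guide, namely that arccp creates missing directories and that the directory remains after the file is removed.

```shell
#!/bin/sh
# run COMMAND [ARGS...]
# Prints the command if DRY_RUN is set, otherwise executes it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# mkdir_workaround DIR_URL
# Hypothetical helper: creates DIR_URL by uploading a small placeholder
# file into it and then deleting that file again.
mkdir_workaround() {
    dir_url=${1%/}
    placeholder=$(mktemp) || return 1
    echo "placeholder" > "$placeholder"
    run arccp "$placeholder" "$dir_url/.placeholder" &&
        run arcrm "$dir_url/.placeholder"
    status=$?
    rm -f "$placeholder"        # always clean up the local temporary file
    return "$status"
}

# Dry run: show the two commands without contacting the storage service.
( DRY_RUN=1; mkdir_workaround gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir )
```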
 
 
 
= Removing files or directories =
 
 
 
$ arcrm gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir/dummyfile
 
$ arcrm gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir/
 
 
 
Directories must be empty before they can be removed, so delete the files in a directory first, as in the example above.
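For a directory with several files, the removal order (files first, then the directory) can be generated from a plain listing. The sketch below only prints the arcrm commands so that they can be reviewed before anything is deleted. <code>rm_commands</code> is a hypothetical helper, and it assumes a flat directory whose file names contain no spaces or shell metacharacters.

```shell
#!/bin/sh
# rm_commands DIR_URL
# Hypothetical helper: reads one file name per line on standard input
# (the format printed by plain arcls) and prints the arcrm commands that
# empty the directory and then remove the directory itself.
rm_commands() {
    dir_url=${1%/}
    while IFS= read -r name; do
        [ -n "$name" ] || continue      # ignore empty lines
        printf 'arcrm %s/%s\n' "$dir_url" "$name"
    done
    printf 'arcrm %s/\n' "$dir_url"
}

# Demonstration with the file name used in the example above.
printf 'dummyfile\n' |
    rm_commands gsiftp://gsiftp.swestore.se/snic/YOUR_PROJECT_NAME/newdir
```

The demonstration prints the same two commands as the example above. With real data, the input would come from <code>arcls</code> on the same URL.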
 
 
 
= Known problems =
 
 
 
== ARC 0.8 versus 1.0 ==
 
 
 
In late spring 2011 Nordugrid released version 1.0 of ARC (sometimes called 11.05). One of the new features of 1.0 compared to the previous 0.8 release was a new command set. Most of the ng* commands were replaced with the new arc* commands. Some functionality moved between commands (ngstat became arcinfo and arcstat) and some new commands were introduced (arcproxy as a replacement for grid-proxy-init, which wasn't an ARC command at all but a part of the Globus Toolkit). There are still legacy compatibility binaries in place for the old ng* commands, but we strongly suggest that you use the arc* commands when available.
 
 
 
If you switch between ng* and arc* commands on the same local account, you may get warnings such as:
 
 
 
 Bad format detected in file /home/jens/.arc/srms.conf, in line srm.swegrid.se 8443 2.2
 Unwrapped data does not fit into buffer
 Connection to server failed: Connection refused
 Connection to server failed: Connection refused

or

 WARNING: Bad or old format detected in file /home/jens/.arc/srms.conf, in line srm.swegrid.se 8443 gsi 2.2
 WARNING: Bad or old format detected in file /home/jens/.arc/srms.conf, in line srm.swegrid.se 8443 gsi 2.2
 
 
 
The file <code>srms.conf</code> in the <code>.arc</code> directory is updated automatically when a resource is accessed, and ngls and arcls do not agree on the format of its content. Bug reports have been filed about this. The warning is confusing and shouldn't be displayed. Running the same command again will probably not display these messages.
 
 
 
== arcproxy 1.0.1 ==
 
 
 
There is a bug in dCache which makes proxy certificates from arcproxy 1.0.1 unusable. This is the version distributed in the 11.05-2 standalone and MacOS clients. The error you get from arcls is:
 
 
 
ERROR: Failed listing files
 
 
 
All other versions of arcproxy should be fine. If you encounter this version of arcproxy, please use grid-proxy-init if available. The generated proxy certificates should be equivalent.
 

Latest revision as of 09:55, 8 February 2023