Difference between revisions of "FASTA"

From SNIC Documentation
 
(8 intermediate revisions by the same user not shown)
Line 1: Line 1:
[[Category:Bioinformatics]]
+
{{software info
[http://fasta.bioch.virginia.edu/fasta_www2/fasta_list2.shtml FASTA] is a software package for aligning nucleotide or amino acid sequences. Its primary use is to search databases for sequences that are similar to a given candidate sequence.
+
|description=package for aligning nucleotide or amino acid sequences
 
+
|license=free
Responsible person: [[User:Joel Hedlund (NSC)]]
+
|fields=bioinformatics
 +
}}
 +
{{PAGENAME}} is a software {{#show: {{PAGENAME}}|?description}}. Its primary use is to search databases for sequences that are similar to a given candidate sequence.
  
 
== Computational considerations ==
 
Line 16: Line 18:
 
* '''Choose a system with enough RAM''' <br/> Multiprocessor systems generally have more memory than single processor systems, and the database will also require proportionally less memory, since only one copy is needed in the OS file cache regardless of the number of processors using it.
 
 
* '''Partition the search space''' <br/> For huge databases or very restricted amounts of available memory it may be required to split the database into manageable chunks and process them as separate jobs.
 
 +
 +
== Availability ==
 +
{{list resources for software}}
 +
 +
== License ==
 +
{{show license}}
 +
 +
== Experts ==
 +
{{list experts}}
  
 
== Links ==
 
 
* [http://fasta.bioch.virginia.edu/fasta_www2/fasta_list2.shtml Official site]
 

Latest revision as of 12:28, 12 September 2011

FASTA is a software package for aligning nucleotide or amino acid sequences. Its primary use is to search databases for sequences that are similar to a given candidate sequence.

Computational considerations

Work locally

Many of the features in FASTA require access to database flatfiles. Standard practice on a compute cluster is to copy all necessary files to a node-local directory before doing any work with them. This is strongly encouraged on most resources, because many simultaneous accesses to the same large files on a shared disk are likely to cause problems for every computation currently running on the resource, not only for the owner of the badly behaving jobs.
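
The staging step described above can be sketched in Python. This is only an illustration, not part of FASTA or of any SNIC tooling: the function name stage_locally and its arguments are hypothetical, and the location of node-local scratch space is site-specific, so local_root must be supplied by the caller.

```python
import shutil
import tempfile
from pathlib import Path


def stage_locally(shared_files, local_root):
    """Copy files from shared storage into a fresh directory under local_root.

    Returns the new working directory and the list of staged paths.
    Files that share a base name would overwrite each other, so the
    caller is assumed to pass uniquely named files.
    """
    # A fresh directory per job avoids collisions with other jobs on the node.
    workdir = Path(tempfile.mkdtemp(prefix="fasta_", dir=str(local_root)))
    staged = []
    for src in map(Path, shared_files):
        dst = workdir / src.name
        shutil.copy2(src, dst)  # one sequential read from the shared disk
        staged.append(dst)
    return workdir, staged
```

All later reads then go to the staged copies, and the working directory can be deleted with shutil.rmtree once the job has copied its results back.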

Do not run out of memory

If possible, ensure that you have enough RAM to hold the database as well as the results, with some headroom to spare. FASTA then does not need to read data from disk unnecessarily, which would otherwise cause a significant slowdown. For example:

  • Choose a system with enough RAM
    Multiprocessor systems generally have more memory than single-processor systems, and the database also requires proportionally less memory, since only one copy is needed in the OS file cache regardless of the number of processors using it.
  • Partition the search space
    For huge databases, or when available memory is very limited, it may be necessary to split the database into manageable chunks and process them as separate jobs.
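
Partitioning the search space can be sketched as follows, assuming the database is a plain FASTA flatfile in which every record starts with a '>' header line. The function name split_fasta, the chunk file naming, and the max_bytes budget are hypothetical choices for illustration; a suitable budget depends on the memory of the target nodes.

```python
from pathlib import Path


def split_fasta(db_path, out_dir, max_bytes):
    """Split a FASTA flatfile into chunk files of roughly max_bytes each.

    Records (a '>' header line plus its sequence lines) are never split,
    so a single record larger than max_bytes gets a chunk of its own.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks = []       # paths of the chunk files written so far
    out = None        # currently open chunk file
    written = 0       # bytes written to the current chunk
    record = []       # lines of the record being read
    record_size = 0

    def flush():
        """Write the buffered record, starting a new chunk if it would not fit."""
        nonlocal out, written, record, record_size
        if not record:
            return
        if out is None or (written > 0 and written + record_size > max_bytes):
            if out is not None:
                out.close()
            path = out_dir / f"chunk_{len(chunks):04d}.fasta"
            out = open(path, "wb")
            chunks.append(path)
            written = 0
        out.writelines(record)
        written += record_size
        record, record_size = [], 0

    with open(db_path, "rb") as db:
        for line in db:
            if line.startswith(b">"):
                flush()  # the previous record is complete
            record.append(line)
            record_size += len(line)
    flush()  # write the last record
    if out is not None:
        out.close()
    return chunks
```

Each chunk can then be searched as a separate job. Scores that depend on database size, such as E-values, are then computed per chunk, so results from different chunks may need care when they are merged.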

Availability

Resource | Centre | Description
Triolith | NSC | Capability cluster with 338 TFLOPS peak and 1:2 Infiniband fat-tree

License

License: Free.

Experts

No experts have currently registered expertise on this specific subject. List of registered field experts:

Expert | Centre | Field | AE FTE | General activities
Anders Hast (UPPMAX) | UPPMAX | Visualisation, Digital Humanities | 30 | Software and usability for projects in digital humanities
Anders Sjölander (UPPMAX) | UPPMAX | Bioinformatics | 100 | Bioinformatics support and training, job efficiency monitoring, project management
Anders Sjöström (LUNARC) | LUNARC | GPU computing; MATLAB; General programming; Technical acoustics | 50 | Helps users with MATLAB, General programming, Image processing, Usage of clusters
Birgitte Brydsö (HPC2N) | HPC2N | Parallel programming; HPC | | Training, general support
Björn Claremar (UPPMAX) | UPPMAX | Meteorology, Geoscience | 100 | Support for geosciences, Matlab
Björn Viklund (UPPMAX) | UPPMAX | Bioinformatics; Containers | 100 | Bioinformatics, containers, software installs at UPPMAX
Chandan Basu (NSC) | NSC | Computational science | 100 | EU projects IS-ENES and PRACE. Working on climate and weather codes
Diana Iusan (UPPMAX) | UPPMAX | Computational materials science; Performance tuning | 50 | Compilation, performance optimization, and best practice usage of electronic structure codes.
Frank Bramkamp (NSC) | NSC | Computational fluid dynamics | 100 | Installation and support of computational fluid dynamics software.
Hamish Struthers (NSC) | NSC | Climate research | 80 | Users support focused on weather and climate codes.
Henric Zazzi (PDC) | PDC | Bioinformatics | 100 | Bioinformatics Application support
Jens Larsson (NSC) | NSC | Swestore | |
Jerry Eriksson (HPC2N) | HPC2N | Parallel programming; HPC | | HPC, Parallel programming
Joachim Hein (LUNARC) | LUNARC | Parallel programming; Performance optimisation | 85 | HPC training; Parallel programming support; Performance optimisation
Johan Hellsvik | PDC | Materials science | 30 | materials theory, modeling of organic magnetic materials,
Johan Raber (NSC) | NSC | Computational chemistry | 50 |
Jonas Lindemann (LUNARC) | LUNARC | Grid computing; Desktop environments | 20 | Coordinating SNIC Emerging Technologies; Developer of ARC Job Submission Tool; Grid user documentation; Leading the development of ARC Storage UI; Lunarc Box; Lunarc HPC Desktop
Krishnaveni Chitrapu (NSC) | NSC | Software development | |
Lars Eklund (UPPMAX) | UPPMAX | Chemistry; Data management; FAIR; Sensitive data | 100 | Chemistry codes, databases at UPPMAX, sensitive data, PUBA agreements
Lars Viklund (HPC2N) | HPC2N | General programming; HPC | | HPC, General programming, installation of software, support, containers
Lilit Axner (PDC) | PDC | Computational fluid dynamics | 50 |
Marcus Lundberg (UPPMAX) | UPPMAX | Computational science; Parallel programming; Performance tuning; Sensitive data | 100 | I help users with productivity, program performance, and parallelisation. I also work with allocations and with sensitive data questions
Martin Dahlö (UPPMAX) | UPPMAX | Bioinformatics | 10 | Bioinformatic support
Matias Piqueras (UPPMAX) | UPPMAX | Humanities, Social sciences | 70 | Support for humanities and social sciences, machine learning
Mikael Djurfeldt (PDC) | PDC | Neuroinformatics | 100 |
Mirko Myllykoski (HPC2N) | HPC2N | Parallel programming; GPU computing | | Parallel programming, HPC, GPU programming, advanced support
Pavlin Mitev (UPPMAX) | UPPMAX | Computational materials science | 100 |
Pedro Ojeda-May (HPC2N) | HPC2N | Molecular dynamics; Machine learning; Quantum Chemistry | | Training, HPC, Quantum Chemistry, Molecular dynamics, R, advanced support
Peter Kjellström (NSC) | NSC | Computational science | 100 | All types of HPC Support.
Peter Münger (NSC) | NSC | Computational science | 60 | Installation and support of MATLAB, Comsol, and Julia.
Rickard Armiento (NSC) | NSC | Computational materials science | 40 | Maintainer of the scientific software environment at NSC.
Szilard Pall | PDC | Molecular dynamics | 55 | Algorithms & methods for accelerating molecular dynamics, Parallelization and acceleration of molecular dynamics on modern high performance computing architectures, High performance computing, manycore and heterogeneous architectures, GPU computing
Thomas Svedberg (C3SE) | C3SE | Solid mechanics | |
Torben Rasmussen (NSC) | NSC | Computational chemistry | 100 | Installation and support of computational chemistry software.
Wei Zhang (NSC) | NSC | Computational science; Parallel programming; Performance optimisation | | code optimization, parallelization.
Weine Olovsson (NSC) | NSC | Computational materials science | 90 | Application support, installation and help
Åke Sandgren (HPC2N) | HPC2N | Computational science | 50 | SGUSI

Links