Part 2
February 11, 2025
Natalie Gill Bioinformatician II
Run the following commands if you did not attend part 1:
gzip : compresses a file and replaces it with a compressed version (.gz)
tar : creates and manipulates archive files
Archive: a single file that contains one or more files and/or folders, often compressed
39M part_2/homo_sapiens.refseq.tsv
3.7M part_2/homo_sapiens.refseq.tsv.gz
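The size difference above can be reproduced on a small scale. This is a minimal sketch, assuming a throwaway directory from mktemp and a made-up, highly repetitive file (demo.tsv is a hypothetical stand-in, not the workshop data):

```shell
#!/bin/bash
# Work in a temporary directory so no real files are touched.
workdir=$(mktemp -d)

# Build a repetitive tab-separated file (hypothetical stand-in data).
for ((i = 0; i < 2000; i++)); do
    printf 'ENSG00000142611\tENST00000378391\tRefSeq_mRNA\n'
done > "$workdir/demo.tsv"
orig_bytes=$(( $(wc -c < "$workdir/demo.tsv") ))

# gzip replaces demo.tsv with demo.tsv.gz
gzip "$workdir/demo.tsv"
gz_bytes=$(( $(wc -c < "$workdir/demo.tsv.gz") ))
echo "original: $orig_bytes bytes, compressed: $gz_bytes bytes"

# gunzip restores demo.tsv and removes demo.tsv.gz
gunzip "$workdir/demo.tsv.gz"
restored_bytes=$(( $(wc -c < "$workdir/demo.tsv") ))
```

Repetitive text compresses especially well, so the ratio here will not match the ratio of a real file.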
total 8
drwx---r--@ 4 nelphick staff 128 Nov 10 15:00 part_1
-rw-r--r-- 1 nelphick staff 801 Nov 10 15:00 part_1.tar.gz
drwxr-xr-x@ 4 nelphick staff 128 Nov 10 15:00 part_2
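A listing like the one above typically results from extracting a .tar.gz archive. The sketch below shows one way to create, list, and extract such an archive. The directory and file names are made up for illustration and everything happens in a temporary directory:

```shell
#!/bin/bash
workdir=$(mktemp -d)

# Make a small folder to archive (hypothetical contents).
mkdir -p "$workdir/src/part_1"
echo "hello" > "$workdir/src/part_1/a.txt"
echo "world" > "$workdir/src/part_1/b.txt"

# -c create, -z compress with gzip, -f archive file name
# -C switches to the given directory before adding part_1
tar -czf "$workdir/part_1.tar.gz" -C "$workdir/src" part_1

# -t lists the contents without extracting anything
listing=$(tar -tzf "$workdir/part_1.tar.gz")
echo "$listing"

# -x extracts; -C chooses where the files are written
mkdir "$workdir/out"
tar -xzf "$workdir/part_1.tar.gz" -C "$workdir/out"
ls -l "$workdir/out/part_1"
```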
gunzip -c : writes the decompressed contents to standard output and leaves the .gz file in place
gene_stable_id transcript_stable_id protein_stable_id xref db_name info_type source_identity xref_identity linkage_type
ENSG00000142611 ENST00000378391 ENSP00000367643 NP_955533 RefSeq_peptide DIRECT 100 100 -
ENSG00000142611 ENST00000378391 ENSP00000367643 NM_199454 RefSeq_mRNA DIRECT 99 62 -
ENSG00000142611 ENST00000270722 ENSP00000270722 NP_071397 RefSeq_peptide DIRECT 100 100 -
ENSG00000142611 ENST00000270722 ENSP00000270722 NM_022114 RefSeq_mRNA DIRECT 100 100 -
ENSG00000232596 ENST00000420522 - NR_147025 RefSeq_ncRNA DIRECT 100 66 -
ENSG00000231510 ENST00000443270 - NR_183671 RefSeq_ncRNA DIRECT 100 64 -
ENSG00000149527 ENST00000449969 ENSP00000397289 NP_001289941 RefSeq_peptide DIRECT 100 100 -
ENSG00000149527 ENST00000449969 ENSP00000397289 NM_001303012 RefSeq_mRNA DIRECT 99 99 -
ENSG00000149527 ENST00000449969 ENSP00000397289 XM_047435038 RefSeq_mRNA_predicted DIRECT 96 92 -
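Output like the rows above can be previewed without unpacking the file on disk. A minimal sketch, assuming a tiny made-up file (demo.tsv.gz is hypothetical) and piping into head to keep only the first lines:

```shell
#!/bin/bash
workdir=$(mktemp -d)

# Small stand-in table with a header row (hypothetical data).
printf 'gene_stable_id\txref\nENSG00000142611\tNP_955533\nENSG00000142611\tNM_199454\nENSG00000232596\tNR_147025\n' > "$workdir/demo.tsv"
gzip "$workdir/demo.tsv"

# -c sends the decompressed text to standard output;
# demo.tsv.gz itself is left unchanged on disk.
preview=$(gunzip -c "$workdir/demo.tsv.gz" | head -n 2)
echo "$preview"
```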
Example:
These can vary with the specific OS or program; for example, TMPDIR may instead be named TEMP, TEMPDIR, or TMP.
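Because the name of the temporary-directory variable varies, a script can check several names in turn. The sketch below assumes bash. WORKSHOP_NAME is a hypothetical variable introduced only for illustration, and /tmp is an assumed last-resort fallback:

```shell
#!/bin/bash
# Print two commonly set environment variables.
echo "HOME is ${HOME:-unset}"
echo "USER is ${USER:-unset}"

# ${VAR:-fallback} uses the fallback when VAR is unset or empty,
# so this tries TMPDIR, then TMP, then TEMP, then /tmp.
tmp_location="${TMPDIR:-${TMP:-${TEMP:-/tmp}}}"
echo "Temporary files go in $tmp_location"

# export makes a variable visible to programs started from this shell.
export WORKSHOP_NAME="unix_part_2"   # hypothetical variable
child_view=$(bash -c 'echo "$WORKSHOP_NAME"')
echo "A child shell sees WORKSHOP_NAME=$child_view"
```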
When you run a command, the shell searches the directories listed in $PATH to find its associated executable file
/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/go/bin:/usr/local/mysql/bin
You can add a directory to $PATH like this:
export PATH="/path/to/new/software:$PATH"
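To see the effect of prepending a directory to $PATH, the sketch below creates a tiny executable in a temporary directory and then looks it up. hello_tool is a hypothetical program invented for this example:

```shell
#!/bin/bash
workdir=$(mktemp -d)
mkdir "$workdir/bin"

# Create a hypothetical executable called hello_tool.
cat > "$workdir/bin/hello_tool" <<'EOF'
#!/bin/bash
echo "hello from hello_tool"
EOF
chmod +x "$workdir/bin/hello_tool"

# Prepend the new directory so it is searched first.
export PATH="$workdir/bin:$PATH"

# command -v reports which file the shell would run for a name.
found_at=$(command -v hello_tool)
tool_output=$(hello_tool)
echo "hello_tool resolves to $found_at"
```

The change lasts only as long as the shell that ran the export.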
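This changes $PATH for the current terminal session only
To make the change permanent, add the export line to ~/.bashrc or ~/.zshrc
Modifying $PATH incorrectly can break system functionality
Shell scripts usually end in .sh
nano part_2/example_script.sh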
#!/bin/bash
#! tells the OS where the interpreter is
-rw-r--r-- 1 nelphick staff 287 Nov 10 15:00 part_2/example_script.sh
#!/bin/bash
# This is a comment. Comments are ignored by the shell.
# $1 is the first argument passed to the script
echo "Counting the genes in $1"
# count the unique genes in the file
u_genes=$(gunzip -c "$1" | cut -f 1 | sort -u | wc -l)
echo "There are $u_genes unique genes in $1"
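The script above can be run in two ways: by passing it to bash, or by making it executable and calling it directly. The sketch below recreates a copy of the script in a temporary directory and runs it on a tiny made-up input (demo.tsv.gz is hypothetical and has no header row):

```shell
#!/bin/bash
workdir=$(mktemp -d)

# Recreate the example script in the temporary directory.
cat > "$workdir/example_script.sh" <<'EOF'
#!/bin/bash
echo "Counting the genes in $1"
u_genes=$(gunzip -c "$1" | cut -f 1 | sort -u | wc -l)
echo "There are $u_genes unique genes in $1"
EOF

# Stand-in input: three rows, two distinct gene IDs, no header.
printf 'ENSG00000142611\tNP_955533\nENSG00000142611\tNM_199454\nENSG00000232596\tNR_147025\n' \
    | gzip -c > "$workdir/demo.tsv.gz"

# Option 1: hand the script to the interpreter explicitly.
bash "$workdir/example_script.sh" "$workdir/demo.tsv.gz"

# Option 2: add execute permission, then run it directly;
# the #! line tells the OS to use /bin/bash.
chmod +x "$workdir/example_script.sh"
result=$("$workdir/example_script.sh" "$workdir/demo.tsv.gz" | tail -n 1)
echo "$result"
```

On a file that starts with a header row, the header's first field is counted as one more distinct value. Adding `tail -n +2` before `cut` would skip it.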
Example:
sed 's/search_string/replace_string/g' input.txt > output.txt
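As a concrete instance of the pattern above, the sketch below replaces a string in a small made-up file. It also shows what the trailing g does by comparing against a run without it. The file names and contents are illustrative only:

```shell
#!/bin/bash
workdir=$(mktemp -d)
printf 'RefSeq_mRNA DIRECT\nRefSeq_mRNA RefSeq_mRNA\nRefSeq_peptide DIRECT\n' > "$workdir/input.txt"

# With g: every match on each line is replaced.
sed 's/RefSeq_mRNA/mRNA/g' "$workdir/input.txt" > "$workdir/output.txt"
cat "$workdir/output.txt"

# Without g: only the first match on each line is replaced.
# sed -n '2p' prints just the second line of the result.
first_only=$(sed 's/RefSeq_mRNA/mRNA/' "$workdir/input.txt" | sed -n '2p')
echo "$first_only"
```

Because the result is redirected to output.txt, input.txt is left unchanged.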
ssh username@remote
username is your user account on the remote server, and remote is the hostname or IP address of the remote server or computer
scp [options] [source] [destination]
scp /path/to/local/file.txt username@remote:/path/to/remote/directory/
scp username@remote:/path/to/file.txt /path/to/local/directory/
Basic command:
awk options 'pattern {action}' input_file
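A small worked instance of the basic command may help. The sketch below assumes a made-up three-column, tab-separated file (demo.tsv is hypothetical). It uses the -F option to set the field separator, a pattern to select rows, and an action to print fields:

```shell
#!/bin/bash
workdir=$(mktemp -d)
printf 'ENSG00000142611\tNP_955533\tRefSeq_peptide\nENSG00000142611\tNM_199454\tRefSeq_mRNA\nENSG00000232596\tNR_147025\tRefSeq_ncRNA\n' > "$workdir/demo.tsv"

# option:  -F '\t' splits each line on tabs
# pattern: $3 == "RefSeq_mRNA" keeps rows whose third field matches
# action:  {print $1, $2} prints the first and second fields
mrna_rows=$(awk -F '\t' '$3 == "RefSeq_mRNA" {print $1, $2}' "$workdir/demo.tsv")
echo "$mrna_rows"

# END runs once after all input; NR is the number of lines read.
row_count=$(awk 'END {print NR}' "$workdir/demo.tsv")
echo "$row_count rows"
```

The comma in the print statement joins the fields with a single space by default.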
$1,$2 : the first and second fields
Introduction to RNA-Seq Analysis February 13-February 14, 2025 1:00-4:00pm PST
Intermediate RNA-Seq Analysis Using R February 20, 2025 9:00am-12:00pm PST
Introduction to Statistics, Experimental Design and Hypothesis Testing February 24-February 25, 2025 1:00-3:00pm PST