*Welcome to Software Carpentry

We will use this Etherpad to share links and snippets of code, take notes, ask and answer questions, and whatever else comes to mind.

Unix and Linux: Visual QuickStart Guide - http://www.amazon.com/Unix-Linux-Visual-QuickStart-Guide/dp/0321636783/ref=sr_1_1?ie=UTF8&qid=1412084621&sr=8-1&keywords=unix+and+linux+quick+start+guide
Advanced Pandas Guide - http://www.amazon.com/Python-Data-Analysis-Wrangling-IPython/dp/1449319793/ref=sr_1_1?ie=UTF8&qid=1412170843&sr=8-1&keywords=pandas+python
Data Files:  joryschossau.github.io/2014-09-30-msu/data.zip
Leigh's Notebook: http://joryschossau.github.io/2014-09-30-msu/01-numpy-Leigh.ipyn
BASH Cheat Sheet: http://joryschossau.github.io/2014-09-30-msu/BASH_cheat_sheet.txt
Python Tutor: http://www.pythontutor.com/visualize.html
Useful further python lessons at the same level: www.software-carpentry.org/v4/python/index.html


Day 1 IPython notebook (numpy, functions, matplotlib): http://joryschossau.github.io/2014-09-30-msu/data.ipynb
SwC lesson: http://software-carpentry.org/v5/novice/python/01-numpy.html

Jory's email: jory@msu.edu
Leigh's email: leighs@msu.edu
Luiz's email: irberlui@msu.edu
Josh's email: nahumjos@msu.edu

Software Carpentry website: http://software-carpentry.org
IPython website: http://ipython.org
IPython notebook examples: http://nbviewer.ipython.org/
Visualization of Python executing code: http://pythontutor.com/
Matplotlib gallery, with example code: http://matplotlib.org/gallery.html

Day 2 Pandas notebook: http://joryschossau.github.io/2014-09-30-msu/pandas.ipynb
Pandas SwC lesson: http://software-carpentry.org/v5/intermediate/python/01-intro-python.html

Python Block Execution Visualization: http://www.python-course.eu/images/blocks.png
SWC suggested resources: http://software-carpentry.org/bib/bib.html

Shell commands/programs:
-whoami: current user name
-cd: change directory; use ".." to move up to the parent directory
-pwd: prints current working directory
-ls: list directory contents
-man: show the manual page for a command
-mkdir: make new directory
-rmdir: remove directory
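-touch: create an empty file (or update the timestamp of an existing one)
-rm: remove files (use -r to remove a directory and its contents)
-curl: transfers data from a URL (think "cat url")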
-unzip: use for .zip files
-du: disk usage
-mv: move
-cp: copy
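-cat: concatenate files and print them to the screen
-wc: "word count" - counts lines, words, and characters
-sort: sort lines of text (-n for numeric order)
-head: print first 10 lines (-n to choose how many)
-tail: print last 10 lines (-n to choose how many)
-echo: print its arguments
-tr: translate or delete characters (-s squeezes repeated characters into one)
-uniq: collapse adjacent duplicate lines (-c to count them)
-history: list previously entered commands
-less: text viewer (pager) for scrolling through a file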
-nano: text editor
-clear: clear window

*Command History from BASH Session
*=============================
  507  clear
  508  ;awiefj;aisdf
  509  alksdjlaskfj
  510  cat
  511  while if cat; echo; lksdjfsd
  512  clear
  513  cd
  514  pwd
  515  cd Desktop
  516  pwd
  517  pwd
  518  ls
  519  cd ..
  520  pwd
  521  cd Desktop
  522  cd ..
  523  cd Desktop/cmdlBuild/
  524  pwd
  525  cd ../..
  526  cd Desktop/../Desktop/../Desktop/../
  527  pwd
  528  cd Downloads/
  529  pwd
  530  ls
  531  cd ..
  532  pwd
  533  cd Desktop
  534  clear
  535  ls
  536  ls -l
  537  ls --help
  538  man help
  539  man ls
  540  ls -l ..
  541  clear
  542  man ls
  543  man pwd
  544  man cd
  545  pwd
  546  ls
  547  ls -l
  548  man ls
  549  ls -k -s
  550  ls -ks
  551  ls -ksh
  552  ls -sh
  553  ls -k -s -h
  554  ls -lh
  555  pwd
  556  clear
  557  ls
  558  clear
  559  pwd
  560  mkdir workshop
  561  ls -F
  562  cd workshop/
  563  ls
  564  touch readme.txt
  565  ls
  566  ls -lh
  567  mkdir jory
  568  ls
  569  rmdir jory
  570  ls
  571  mkdir jory
  572  touch jory/readme.txt
  573  ls
  574  cd jory
  575  ls
  576  rm readme.txt 
  577  ls
  578  touch readme.txt
  579  ls
  580  cd ..
  581  ls
  582  ls jory
  583  rmdir jory
  584  rm -r jory
  585  clear
  586  pwd
  587  ls
  588  mkdir user/jory/data
  589  mkdir -p user/jory/data
  590  ls
  591  ls user
  592  ls user/jory/
  593  pwd
  594  ls
  595  rm -r -i user
  596  curl joryschossau.github.io/2014-09-30-msu/data.zip > data.zip
  597  ls
  598  curl msu.edu
  599  man curl
  600  unzip data.zip 
  601  ls
  602  du
  603  man du
  604  du -ha
  605  du -h .
  606  du .
  607  du -s
  608  du -sh
  609  ls
  610* mv data.zip originalData.zip
  611  ls
  612  cp originalData.zip modifiedData.zip
  613  ls
  614  ls chem
  615  ls
  616  cp chem backup
  617  cp -r chem backup
  618  ls
  619  du -sh
  620  du -h
  621  cd chem
  622  ls
  623  ls a*
  624  ls a*ia
  625  ls a*ia*
  626  ls ?a*
  627  cat aldrin.pdb 
  628  wc
  629  wc *
  630  man wc
  631  wc -l *
  632  wc -l * > lengths
  633  cat lengths
  634  sort lengths
  635  sort -n lengths
  636  sort -n lengths > lengths-sorted
  637  cat lengths-sorted 
  638  head lengths-sorted 
  639  tail lengths-sorted
  640  head -n 1 lengths-sorted 
  641  sort -n lengths
  642  sort -n lengths | head
  643  wc -l *.pdb | sort -n | tail
  644  wc -l *.pdb | sort -n | tail
  645  echo "some kind of output"
  646  echo "one,two,three"
  647  echo "one,two,three" | cut -d ',' -f2
  648  cat > numbers
  649  cat numbers
  650  cat numbers | cut -d ',' -f2
  651  cat numbers | cut -d ',' -f2-
  652  wc -l *.pdb | sort -n | head
  653  wc -l *.pdb | sort -n | cut -d ' ' | head
  654  wc -l *.pdb | sort -n | cut -d ' ' -f1 | head
  655  wc -l *.pdb | sort -n | tr -s ' ' | head
  656  wc -l *.pdb | sort -n | tr -s ' ' | cut -d ' ' -f1 | head
  657  wc -l *.pdb | sort -n | tr -s ' ' | cut -d ' ' -f2 | head
  658  wc -l *.pdb | sort -n | tr -s ' ' | cut -d ' ' -f2
  659  history
  660  wc -l *.pdb | sort -n | tr -s ' ' | cut -d ' ' -f2
  661  wc -l *.pdb | sort -n | tr -s ' ' | cut -d ' ' -f2 | uniq
  662  wc -l *.pdb | sort -n | tr -s ' ' | cut -d ' ' -f2
  663  wc -l *.pdb | sort -n | tr -s ' ' | cut -d ' ' -f2 | uniq -c
  664  less aldrin.pdb 
  665  nano aldrin.pdb 
  666  history
  667  history | tail -n 166
  668  clear
===============================================


IPython Notebook Session
====================

The "print" command displays values in Python and is useful for checking what a variable holds

IPython tries to direct you to where your error may have originated - marked with a caret (^) in the error message

Calling a function requires parentheses - green means it has been completed

Wrap a number in str() to have it treated as text rather than as a numeric value
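
A minimal sketch of the str() note above. The variable names and the message text are made up for illustration.

```python
# A number has to be converted to text before it can be joined to a string.
count = 42
message = 'The answer is ' + str(count)
print(message)
```

Without str(), adding a string and a number raises a TypeError.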

Python 2 discards the decimal part when dividing two whole numbers (e.g. 7/2 gives 3) unless you tell it otherwise - converting one number with float() keeps the decimals
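
A small sketch of the division behaviour noted above. It uses the `//` operator for whole-number division so that it behaves the same under Python 2 and Python 3; the variable names are illustrative only.

```python
# Whole-number (floor) division drops the decimal part.
whole = 7 // 2

# Converting one operand with float() keeps the decimals.
exact = float(7) / 2

print(whole)
print(exact)
```

In Python 3 a plain `7 / 2` already keeps the decimals, so the float() conversion mainly matters for Python 2.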

Text is delimited as a string by wrapping it in apostrophes (single quotes)

Pay attention to the order in which your cells are executed in your notebook

Use '!' to run shell commands from within an IPython notebook (e.g. !ls to list files in the current directory)

Use the import command to load modules, such as numpy (e.g. import numpy)

Put a question mark after a name in IPython to see its documentation (help(function_name) also works)

Useful data functions and syntax
-data.mean(): mean of the values in data
-indexing: e.g. data[0] - things are 'zero-indexed' in Python, so the "first" item is at position 0.
-def: keyword used to define a new function
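
The items above can be sketched together using a plain Python list instead of a numpy array. The `temperatures` list and the `mean` helper are hypothetical, written only for this illustration.

```python
# Hypothetical measurements, stored in a plain list.
temperatures = [20.5, 22.0, 19.5, 24.0]

def mean(values):
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / float(len(values))

first = temperatures[0]    # zero-indexed: position 0 is the first item
last = temperatures[-1]    # negative indices count from the end
average = mean(temperatures)

print(first)
print(last)
print(average)
```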

% character 


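def random_function(data):
    # Assumes data['temperature'] supports .mean() (e.g. a numpy or pandas column).
    mean = data['temperature'].mean()
    for t in data['temperature']:
        if t > mean:
            print(str(t) + ' bigger than mean')
        elif t < mean:
            print(str(t) + ' smaller than mean')
        else:
            print(str(t) + ' equal to mean')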
            
def center(data):
    '''Subtracts the mean of an array of numbers from every element, so each
    value becomes its signed difference from the mean.

    Example (assumes numpy has been imported):
    data = numpy.array([[0,1],[0,-1]])
    print(data)
    [[ 0  1]
     [ 0 -1]]

    print(center(data))
    [[ 0.  1.]
     [ 0. -1.]]
    '''
    center_data = data - data.mean()
    return center_data
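
One way to check a function like center() is to confirm that the centered values average to zero. The sketch below uses a plain-list analogue so that it runs without numpy; `center_values` and the sample numbers are hypothetical and not part of the workshop code.

```python
def center_values(values):
    """Plain-list analogue of center(): subtract the mean from each value."""
    average = sum(values) / float(len(values))
    return [v - average for v in values]

centered = center_values([1.0, 2.0, 3.0, 6.0])
print(centered)

# After centering, the mean should be (approximately) zero.
centered_mean = sum(centered) / float(len(centered))
print(abs(centered_mean) < 1e-9)
```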


Advanced Command Line
================================

View my command line history here! : http://www.cse.msu.edu/~jory/index.html



  543  cd workshop
  596  clear
  597  pwd
  598  pwd
  599  ls
  600  less lsd.pdb 
  601  grep 'N' lsd.pdb 
  602  grep 'N' *.pdb
  603  grep 'N' lsd.pdb 
  604  grep ' N ' lsd.pdb 
  605  grep -l ' N ' lsd.pdb 
  606  grep -l ' N ' *.pdb
  607  grep -l ' N ' *.pdb > nitrogen_composition
  608  grep ' N ' *.pdb
  609  grep ' N ' lsd.pdb
  610  grep -c ' N ' lsd.pdb
  611  grep -c ' N ' *.pdb
  612  grep -n ' N ' *.pdb
  613  grep ' H ' lsd.pdb 
  614  grep 'H' lsd.pdb 
  615  egrep --color 'H' lsd.pdb 
  616  ls
  617  cat lsd.pdb | grep ' O '
  618  cat lsd.pdb | grep ' O ' | wc -l
  619  cat *.pdb | grep ' O ' | wc -l
  620  man grep
  621  clear
  622  cd ..
  623  cp -r chem backupChem
  624  cd chem
  625  ls
  626  for file in *.pdb; do echo "found $file"; done
  627  man basename
  628  basename -a tnt.pdb 
  629  basename -f tnt.pdb 
  630  basename --help
  631  basename tnt.pdb .pdb
  632  for filename in *.pdb; do basic=$(basename $filename .pdb); mv $filename $basic.txt; done
  633  ls
  634  mv *.txt *.pdb
  635  ls
  636  cd ..
  637  ls
  638  rm -r chem
  639  cp -r backupChem/ chem
  640  cd chem
  641  ls
  642  cd ..
  643  do filename in chem/*.pdb
  644  for filename in chem/*.pdb; do basic=$(basename $filename .pdb); mv $filename $basic.txt; done
  645  ls
  646  basename nerol.txt
  647  basename -f nerol.txt 
  648  basename -a nerol.txt 
  649  dirname nerol.txt 
  650  dirname backupChem/lsd.pdb 
  651  rm *.txt
  652  ls
  653  rm -r chem
  654  cp -r backupChem/ chem
  655  touch rename.sh
  656  nano rename.sh
  657  ls
  658  ls chem
  659  pwd
  660  bash rename.sh 
  661  ls
  662  ls chem
  663  nano rename.sh 
  664  ls
  665  cp -r backupChem/ newMolecules
  666  ls newMolecules/
  667  bash rename.sh newMolecules
  668  ls newMolecules/
  669  nano rename.sh 
  670  cp -r backupChem/ secondMolecules
  671  bash rename.sh secondMolecules
  672  ls
  673  nano commandline.py
  674  python commandline.py Jory
  675  pwd
  676  pwd
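
The contents of commandline.py were not recorded in these notes. As a rough sketch only, a script run as "python commandline.py Jory" could read its argument from sys.argv like this; the `greeting` function and its messages are assumptions, not the file written in the session.

```python
import sys

def greeting(argv):
    """Build a greeting from a list shaped like sys.argv.

    argv[0] is the script name; argv[1], if present, is the first argument.
    """
    if len(argv) > 1:
        return 'Hello, ' + argv[1] + '!'
    return 'Hello, stranger!'

if __name__ == '__main__':
    print(greeting(sys.argv))
```

Keeping the logic in a function that takes the argument list makes it easy to try in a notebook without going through the command line.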