Introduction to

img/python_logo.png

Time:

Type F11-e-c in the projected window

Type F11-e-n in the presenter window

Remember ESC or t for navigation, b for blank

REMEMBER TO START THE TIMER HERE

The speaker

img/davide.jpg

Davide Del Vento, PhD in Physics

User Support Software Engineer - Consulting Services Groups - CISL

http://www2.cisl.ucar.edu/uss/csg

ExtraView Tickets: cislhelp@ucar.edu

Time:

If you don't know me...

Feel free to interrupt me with questions; that is why I am here

Foreword

  • You must do the hands-on
  • Some call it "the hard way"
  • But it's simple: you look and you do
  • If you planned to sit here and listen while checking your email...
    • ...feel free, but...
    • ...that's not going to work and you will waste your time

Time:

THIS IS A HANDS-ON SESSION, for two reasons:

o If you are not engaged you do not really learn anything

o Because of the way I structured the lessons, mixing things together (as they are in real life)

Note: this is the first time I am teaching this class in its present form (I've taught it in different forms), so there are some rough edges. Feedback is welcome.

Start!

  • Log in to the machine of your choice

  • Load the python and ipython modules, if applicable

  • Check versions with

    • python --version

    and

    • ipython --version
  • Try to start the interpreters

Time:

Who is already familiar with the python shell? It is like ksh (interactive and script)

What is ipython? Many, many things. The part that we are using here is just an alternative shell, with tab completion and a command history preserved across runs

Log in to ys 3 and show it on tmux

Our first python program

  • Hello World is for wimps
  • Let's make the UNIX cat utility
  • In a simplified version (without all the options)
  • Reads a text file and prints it on the screen
  • Poll: how many lines of code will this be in python?

Time:

Are you familiar with cat?

Show it on tmux

Loop

for line in open("/etc/fstab"):
    print line,

Teaching points:

  • Loop and indentation and colon...
  • ...instead of hard-to-count curly brackets
  • ...or hard-to-track keywords (e.g. end, done, fi)
  • human oriented vs machine oriented, i.e.
    • more English-looking
    • loops on "iterators" not integers
  • simple to read (if properly written)
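
The point about looping on "iterators" rather than integers can be sketched as follows. This is only an illustration: the lines list is made-up sample data standing in for the lines of a file, and print(...) is called with a single argument so the snippet should behave the same on Python 2.7 and Python 3.

```python
# Hypothetical sample data standing in for the lines of a text file.
lines = ["first\n", "second\n", "third\n"]

# Machine-oriented style: loop on an integer index.
by_index = []
for i in range(len(lines)):
    by_index.append(lines[i].rstrip("\n"))

# Human-oriented style: loop directly on the items.
by_item = []
for line in lines:
    by_item.append(line.rstrip("\n"))

print(by_item)
```

Both loops collect the same items; the second one simply has fewer places where an off-by-one mistake can hide.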

Time:

Python is easy (and fast) to write and read

Note my "hard way" of teaching, which mixes syntax, semantics and concepts in a real-world example

No way you're gonna follow me if you don't type

A note about the editor

  • You can be extremely fast if using a multiline, space-aware editor
  • Or just using the VIM/EMACS python bindings
  • I am not doing that, so I don't go too fast

Time:

Don't forget to stop me if I go too fast anyway

or if you have questions

Our second program

  • Python is very concise...
  • ...but our two-line is kind of a cheat
  • One may want to show the content of any file...
  • ...not just /etc/fstab
  • Philosophy: explicit is better than implicit...

Time:

Imagine writing this 2-line cat program in C, FORTRAN, java, or your other favorite language

Arguments are better than hard-coded things

Show it on tmux

Arguments

import argparse
parser = argparse.ArgumentParser()
parser.add_argument("file",
                    type=argparse.FileType('r'),
                    help="prints the content of the file")
args=parser.parse_args()

for line in args.file:
    print line,

Teaching points:

  • import
  • dot-notation to access class attributes, i.e. .
  • Parentheses are used for function calls
  • A statement can span multiple lines freely inside parentheses

Time:

IMPORT: just makes "something" (a MODULE) available from a library installed on the system

DOT: if you know OOP, it is what you expect - otherwise, it just "looks inside" the part "on the left" of the dot, searching for the part "on the right" of the dot

E.G.: on the second line, ArgumentParser() is a "function" inside the argparse "module"

Arguments

import argparse
parser = argparse.ArgumentParser()
parser.add_argument("file",
                    type=argparse.FileType('r'),
                    help="prints the content of the file")
args=parser.parse_args()

for line in args.file:
    print line,

Teaching points:

  • Argparse is magic (and supercool)
  • Automatically creates help and checks (e.g. integer conversion, file exists -- and provides error messages if not)
  • It is a 2.7 feature, but is backported to 2.6
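
A small sketch of the automatic help and checks, for experimenting in the interpreter. The program name "demo" and the argument lists are made up for illustration; passing a list to parse_args() replaces the real command line.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")   # "demo" is a made-up name
parser.add_argument("--lines", type=int, default=10,
                    help="number of lines to print")

# Passing a list to parse_args() replaces sys.argv, handy for experiments.
args = parser.parse_args(["--lines", "3"])
print(args.lines + 1)        # args.lines has already been converted to int

# With no arguments the default is used.
defaults = parser.parse_args([])

# The help text is generated from the add_argument() calls.
help_text = parser.format_help()

# A value that cannot be converted makes argparse print an error and exit.
try:
    parser.parse_args(["--lines", "three"])
    exited = False
except SystemExit:
    exited = True
```

Catching SystemExit is only done here so the experiment keeps running; a real script would normally let argparse stop the program.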

Time:

Anybody interested in how to install argparse in python 2.6?

Or how to install anything, FWIW?

Another program

  • Let's write the head command
  • Which basically is like cat but stops after n lines are printed

Time:

There are many ways of doing this. Some ways are lame. I don't claim that my way is best, but for sure it's not the worst

Show it on tmux

If Clause (and more)

import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--lines",
                    type = int,
                    default = 10)
parser.add_argument("file",
                    type = argparse.FileType('r'),
                    help = "prints the first LINES lines of the file")
args = parser.parse_args()

for n,line in enumerate(args.file):
    # n starts from zero, so when n == args.lines we have printed enough
    if n == args.lines:
        break
    print line,
  • Syntax of the if (note the colon)
  • Break statement
  • Syntax of comments (hash #)
  • The enumerate built-in
  • The double assignment
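
The enumerate built-in and the double assignment can be tried in isolation. A minimal sketch, assuming a made-up list in place of the file; the names index, item, limit and kept are illustrative only.

```python
# Hypothetical sample data standing in for the lines of a file.
lines = ["alpha", "beta", "gamma"]

# enumerate() yields (index, item) pairs, with the index starting at zero.
pairs = list(enumerate(lines))
print(pairs)

# The double assignment unpacks one pair into two names.
index, item = pairs[1]

# Same structure as the head program: stop once `limit` items were kept.
limit = 2
kept = []
for n, line in enumerate(lines):
    if n == limit:
        break
    kept.append(line)
```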

Time:

Describe if, break and comments

Let's skip the enumerate built-in for a moment

And explore the double assignment

Show it on tmux

Next program

  • Prints the n-th Fibonacci number
  • The Fibonacci numbers are a mathematical sequence where
    • The first number is 0
    • The second number is 1
    • The next numbers are the sum of the previous 2
    • Namely: 0, 1, 1, 2, 3, 5, 8, 13, 21, ...

Time:

Show it on tmux

Functions

import argparse
parser = argparse.ArgumentParser()
parser.add_argument("n",
                    type = int,
                    help = "prints the N-th Fibonacci number")
args = parser.parse_args()

def fibonacci(n):           # this is function definition
    a,b = 0,1
    for i in range(n):
        a,b = b,a+b
    return a

print fibonacci(args.n)     # this is function invocation
  • Did you expect anything different from functions?
  • Actually there is a trick: they can return multiple values

Time:

note also the "double" assignment!

Show the trick on tmux

Data structures (hands-on)

  • Numbers
  • Strings
  • Lists
  • List Comprehension
  • Tuples
  • Mutable vs Immutable data structures
  • Dictionaries
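
A few lines to type for each item in the list above. This is a sketch with made-up values and names, not a replacement for exploring in the interpreter.

```python
# Lists are mutable: they can be changed in place.
numbers = [3, 1, 2]
numbers.append(4)

# A list comprehension builds a new list from an existing one.
squares = [x * x for x in numbers]

# Tuples are immutable: item assignment raises TypeError.
point = (1.5, 2.5)
try:
    point[0] = 0.0
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Strings are immutable too; methods return new strings.
word = "banana"
shout = word.upper()

# Dictionaries map keys to values.
ages = {"alice": 30}
ages["bob"] = 25
print(squares)
```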

Time:

Let's explore these more

This is probably the most important slide of this first session. If you have been sleeping, wake up. If you have been checking emails, stop and do the hands-on

o Show it on tmux

More data structures

  • Numbers can be complex and have arbitrary precision
  • Stacks and queues (just lists, they can easily be used for this purpose)
  • Sets (similar to lists, but order is irrelevant and duplicates are disallowed)
  • Many others...
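
A quick tour of these in code, as a sketch with made-up values. The slide uses plain lists for queues; collections.deque from the standard library is shown here as one alternative, because removing from the front of a long list is slow.

```python
from collections import deque

# Integers have arbitrary precision.
big = 2 ** 100

# Complex numbers are built in; j is the imaginary unit.
z = (1 + 2j) * (3 - 1j)

# A list works as a stack: append() pushes, pop() removes the last item.
stack = [1, 2, 3]
top = stack.pop()

# For a queue, deque removes items from the front efficiently.
queue = deque([1, 2, 3])
first = queue.popleft()

# Sets drop duplicates and ignore order.
unique = set([1, 2, 2, 3, 3, 3])
print(big)
```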

Time:

Will not show these, maybe just the arbitrary-precision arithmetic and complex numbers?

Classes

  • Not an OOP class
  • So for our simplified interests a class is something which contains
    • data structures (think Fortran derived types or C's struct)
      • called fields when embedded in a class
    • functions
      • called methods when embedded in a class

Time:

The shortest OOP class was a week long, and one can easily last a semester

Classes (2)

  • We've already seen classes, for example strings
  • particular occurrences of a string are called instances
    • for example "banana" is an occurrence of a str
  • We've seen methods in the string class
    • for example upper() is a method
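
The vocabulary above can be checked directly in the interpreter. A minimal sketch; the variable names are made up.

```python
fruit = "banana"                  # an instance of the str class
is_str = isinstance(fruit, str)

upper = fruit.upper()             # calling a method on the instance
count = fruit.count("an")         # another method of the str class

# Other built-in types are classes as well.
kinds = [type(fruit).__name__, type([]).__name__, type({}).__name__]
print(kinds)
```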

Time:

But also lists, tuples, dictionaries are all classes

Show it in tmux

Classes in action

  • Let's make an object-oriented version of fib.py
  • In other words, rewrite it using classes

Time:

Show it in tmux

The code

import argparse
parser = argparse.ArgumentParser()
parser.add_argument("n",
                    type = int,
                    help = "prints the N-th Fibonacci number")
args = parser.parse_args()

class MyMath(object):           # this is how to make a class
    def fibonacci(self, n):     # note the use of 'self' argument
        a,b = 0,1
        for i in range(n):
            a,b = b,a+b
        return a,b

x,y = MyMath().fibonacci(args.n)
print "x=", x
print "y=", y

Teaching points

  • This class has just one method and no fields
  • Note the use of the indentation (love it or hate it)

Time:

A class is defined with the class keyword

A class contains methods and fields

What's the point for classes? Organize the code more neatly (imagine having hundreds of methods) and simplify other things
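
MyMath has a method but no fields. A minimal sketch of a class that has both might look like this; Tally and its members are made-up names for illustration.

```python
class Tally(object):
    def __init__(self, start):      # runs when an instance is created
        self.value = start          # a field, stored on the instance

    def increment(self):            # a method that uses the field
        self.value = self.value + 1
        return self.value

tally = Tally(5)
tally.increment()
print(tally.value)
```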

Digression on Testing

Question:

  • What is an automated test?

Answer:

  • It's a piece of code that tests the behavior of your program

Question:

  • How many tests should I write?

Answer:

  • Enough, according to your strategy

Time:

Testing Strategies

  • White box testing
    • write a test for every edge case, plus
    • a couple of internal cases
  • Black box testing (and Test Driven Development)
    • write as many tests as necessary to define what your code is supposed to do

Time:

Case study: overlapping rectangles

img/overlapping_1.png
  • Suppose you have to write code to check if two rectangles are overlapping
  • Maybe, if they do, your atmosphere code has to exchange data with the ocean model

Time:

Case study: overlapping rectangles

  • Before you write any code, think of your overlapping method as a black box
  • What is it supposed to do for all the cases like these?
  • Write tests that define what your expected output is
  • Then write code to satisfy these tests
img/overlapping_2.png

Time:

Case study: overlapping rectangles

  • Try to think of edge cases
  • Do not write any code, unless you have a test that fails showing why you need to write that code
  • For example, you need to decide whether these rectangles overlap
img/overlapping_3.png
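
A sketch of how the tests could pin down the behavior. Two things are assumptions made only for this example: rectangles are (left, bottom, right, top) tuples, and rectangles that merely share an edge do not count as overlapping. The second is exactly the kind of decision the tests are meant to record, and the opposite choice is just as legitimate.

```python
def overlap(r1, r2):
    """Return True if two axis-aligned rectangles share some area.

    Each rectangle is a (left, bottom, right, top) tuple (an assumption).
    """
    left1, bottom1, right1, top1 = r1
    left2, bottom2, right2, top2 = r2
    return (left1 < right2 and left2 < right1 and
            bottom1 < top2 and bottom2 < top1)

def test_overlap():
    unit = (0, 0, 1, 1)
    assert overlap(unit, (0.5, 0.5, 2, 2))            # partial overlap
    assert overlap(unit, (0.25, 0.25, 0.75, 0.75))    # one inside the other
    assert not overlap(unit, (2, 2, 3, 3))            # far apart
    assert not overlap(unit, (1, 0, 2, 1))            # shared edge: our choice
    assert overlap((0.5, 0.5, 2, 2), unit)            # order must not matter

test_overlap()
```

In a test-driven workflow the test function would be written first and would fail until overlap() exists and behaves as specified.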

Time:

Testing (bottom line)

  • Test what a piece of code is supposed to do
  • Not what it does
  • Not how it does it

Time:

This is THE most important piece of information about testing

Hands-on: writing a product function

  • Suppose that our computer does not have the product function
  • Let's implement it
  • As repeated summation (for integers)
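
One possible shape for this exercise, as a sketch: the tests state what product is supposed to do, and the function uses only addition and negation. How to treat negative numbers is a choice made here, not something the slide prescribes.

```python
def product(a, b):
    """Multiply two integers using only addition and negation."""
    result = 0
    for i in range(abs(b)):     # add a to itself |b| times
        result = result + a
    if b < 0:                   # fix the sign for a negative b
        result = -result
    return result

def test_product():
    assert product(3, 4) == 12
    assert product(4, 3) == 12      # order must not matter
    assert product(5, 0) == 0       # edge case: zero
    assert product(0, 5) == 0
    assert product(-2, 3) == -6     # edge case: negative numbers
    assert product(2, -3) == -6
    assert product(-2, -3) == 6

test_product()
```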

Time:

I know it does, but I want a simple enough example. Suppose instead that your library does not have the particular integration method you want, and you are implementing that

Show it on tmux

Hands-on: maximal sublist problem

  • Given a 1-dimensional list of floats
  • e.g. [-34, 3, 0, -2, 123, -83]
  • Find a sublist with the largest sum
  • e.g. [ 3, 0, -2, 123 ]
  • The sublist must be contiguous elements
  • i.e. the -2 cannot be skipped
  • but -34 and -83 can

Time:

A somewhat artificial problem, but useful to demonstrate unit testing in practice

Don't say anything about ties just yet

Hands-on: maximal sublist problem

  • Any idea on how to solve it?

Time:

Don't say anything about ties just yet

Break?

img/coffee.png

Time:

Think about it while taking a little break

There is a restroom and a fountain right outside the door, and the cafeteria is upstairs

Hands-on: maximal sublist problem

  • Any idea on how to solve it?
  • One idea: check all the possible sublists
  • Loop over all the possible starts and all the possible ends
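
A sketch of the brute-force idea. The function name and the sample list are made up, and two conventions are assumed here: the answer is non-empty whenever the input is, and when several sublists tie, the first one found wins.

```python
def max_sublist_brute(values):
    """Return a contiguous sublist of values with the largest sum."""
    best = values[:1]       # start from the first item ([] for empty input)
    for start in range(len(values)):
        for end in range(start + 1, len(values) + 1):
            candidate = values[start:end]
            if sum(candidate) > sum(best):
                best = candidate
    return best

print(max_sublist_brute([-34, 3, 0, -2, 123, -83]))
```

This checks every start/end pair and sums each candidate from scratch, so it gets slow quickly as the list grows.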

Time:

Show it on tmux

A better idea: divide and conquer

  • Divide the list into two sublists; the maximal sublist must be one of the following:
    • Entirely in the left sublist
    • Entirely in the right sublist
    • Part in the left sublist, part in the right one
  • Solve the 3 cases independently and return the one with the maximum sum
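
The three cases can be sketched as below. This is one possible implementation, not necessarily the one shown live; best_crossing and max_sublist are made-up names, and the answer is assumed to be non-empty whenever the input is.

```python
def best_crossing(values, mid):
    """Best sublist that contains both values[mid - 1] and values[mid]."""
    total, best_total, start = 0, None, mid
    for i in range(mid - 1, -1, -1):            # extend to the left
        total = total + values[i]
        if best_total is None or total > best_total:
            best_total, start = total, i
    total, best_total, end = 0, None, mid
    for i in range(mid, len(values)):           # extend to the right
        total = total + values[i]
        if best_total is None or total > best_total:
            best_total, end = total, i + 1
    return values[start:end]

def max_sublist(values):
    """Return a contiguous sublist of values with the largest sum."""
    if len(values) <= 1:
        return values
    mid = len(values) // 2
    candidates = [max_sublist(values[:mid]),     # entirely in the left half
                  max_sublist(values[mid:]),     # entirely in the right half
                  best_crossing(values, mid)]    # part left, part right
    return max(candidates, key=sum)

print(max_sublist([-34, 3, 0, -2, 123, -83]))
```

The same tests written for a brute-force version can be reused unchanged here, which is the payoff of testing what the code is supposed to do rather than how it does it.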

Time:

Show it on tmux

More on parsing files

  • Easy to loop as in the cat/head example...
  • ...and split strings according to some rules
  • But even easier with the csv module
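
A sketch of why the csv module is easier than splitting by hand. The data is made up, and a list of strings stands in for an open file, since csv.reader accepts any iterable of lines.

```python
import csv

# Hypothetical sample data; a real program would pass an open file instead.
lines = ['station,temperature,comment',
         'Boulder,21.5,"sunny, windy"',
         'Cheyenne,18.0,cloudy']

reader = csv.reader(lines)
header = next(reader)       # first row holds the column names
rows = list(reader)         # remaining rows, each a list of strings

# Naive splitting breaks on the comma inside the quoted field.
naive = lines[1].split(",")
print(rows[0])
```

Note that every field comes back as a string; numeric columns still need an explicit float() or int().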

Time:

Show it on tmux

Regular Expressions

  • Anybody used grep/awk/sed?

Time:

Regular Expressions

  • Anybody used grep/awk/sed?
  • Write a regular expression to replace all occurrences of (xxx,yyy) with [xxx][yyy]

Time:

Regular Expressions

  • Anybody used grep/awk/sed?
  • Write a regular expression to replace all occurrences of (xxx,yyy) with [xxx][yyy]
  • Solution is trivial, just:
"""s/(\(.\{-}\),\(.\{-}\))/[\1][\2]/g"""

Time:

Regular Expressions

  • Anybody used grep/awk/sed?
  • Write a regular expression to replace all occurrences of (xxx,yyy) with [xxx][yyy]
  • Solution is trivial, just:
"""s/(\(.\{-}\),\(.\{-}\))/[\1][\2]/g"""
  • Try to debug or modify that a year down the road....

Time:

Regular Expressions in Python

import re
pattern = r"""\(       # match the open parenthesis
              (           # group subpattern and capture the content
              .+?          # match one or more times, but as few as possible
                           # this will be captured similarly to \1
              )           # end of capture
              ,           # just a comma
              (           # group subpattern and capture the content
              .+?          # match one or more times, but as few as possible
                           # this will be captured similarly to \2
              )           # end of capture
              \)       # just the closed parenthesis"""

regex = re.compile(pattern, re.VERBOSE)

result = regex.sub(r"[\1][\2]", text)
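
To try the substitution, the same pattern can be written compactly and applied to a sample string. The text value below is made up for illustration.

```python
import re

# Same pattern as the verbose version, written on one line.
regex = re.compile(r"\((.+?),(.+?)\)")

text = "a(1,2) + b(i,j)"          # hypothetical sample input
result = regex.sub(r"[\1][\2]", text)
print(result)
```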

Time:

Duck typing

img/duck.png

Time:

Show it in tmux

Strongly typed?

def mysum(a,b):
    return a+b

print mysum(1,2)
print mysum("a", "b")
  • Looks like the type does not matter, like in a shell script, does it?

Time:

Strongly typed?

def mysum(a,b):
    return a+b

print mysum(1,2)
print mysum("a", "b")
print mysum(1, "a")
  • Would that work?
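
A sketch for checking the answer without crashing the session. The names outcome, as_text and as_number are made up.

```python
def mysum(a, b):
    return a + b

# Mixing an integer and a string is refused rather than silently converted.
try:
    mysum(1, "a")
    outcome = "worked"
except TypeError:
    outcome = "TypeError"

# An explicit conversion states which meaning is wanted.
as_text = mysum(str(1), "a")
as_number = mysum(1, int("2"))
print(outcome)
```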

Time:

Show it in tmux

Invoking external programs

  • Invoking external programs
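
A minimal sketch using the standard subprocess module. To avoid assuming that any particular command is installed, the external program here is the Python interpreter itself (sys.executable); on a real system any command list such as ["ls", "-l"] could take its place.

```python
import subprocess
import sys

# Run a program and capture what it prints.
output = subprocess.check_output([sys.executable, "-c", "print(6 * 7)"])
answer = int(output.decode().strip())

# Run a program and look only at its exit status.
status = subprocess.call([sys.executable, "-c", "import sys; sys.exit(3)"])
print(answer)
```

Passing the command as a list, rather than one string, avoids quoting problems with arguments that contain spaces.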

Time:

Show it in tmux

Environment variables

  • Getting/Setting environment variables
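
A sketch using os.environ, which behaves like a dictionary. DEMO_GREETING and DEMO_NOT_SET are made-up variable names, and the child process is the Python interpreter itself so the example does not depend on any other program.

```python
import os
import subprocess
import sys

# Getting: .get() returns a fallback instead of failing when unset.
home = os.environ.get("HOME", "unknown")
missing = os.environ.get("DEMO_NOT_SET")

# Setting: the change is visible to this process and to its children.
os.environ["DEMO_GREETING"] = "hello"

child = subprocess.check_output(
    [sys.executable, "-c", "import os; print(os.environ['DEMO_GREETING'])"])
seen_by_child = child.decode().strip()
print(seen_by_child)
```

The change does not propagate back to the shell that started Python; it only affects this process and the programs it launches.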

Time:

Show it in tmux

Using python instead of shell scripts

  • Plumbum

Time:

Show it in tmux

References

img/think_python.png
img/hard_way.jpg
img/johnny.jpg

Time:

Think Python - Free as in Freedom Book (you can use for what you want)

Lunch?

img/hamburger.png

Time:

Let's take a lunch break and be back in one hour

License/Credits

  • This material is released under the

    Creative Commons Attribution-NonCommercial 3.0 Unported License

    http://creativecommons.org/licenses/by-nc/3.0/

    In other words, permission is granted to copy, distribute, and/or modify this document under the terms of the license

  • Some parts are taken from the Think Python book referenced in the previous slide

Time: