Everything is an object.
Yes, objects are everything.
Passed by reference
Python's OO nature is not always obvious.
Python:
for i in range(5):
    print i
len("hello") # => 5
Ruby:
(0..4).each {|num| puts num}
"hello".length
No type annotations. Values have types, variables do not.
Values are strongly typed.
s = "hello"
n = 99
s += "world" # ok
s += n # error
Python uses structural typing, Java uses nominal typing.
Structural typing is also called "duck typing": if it walks like a duck
and quacks like a duck, it is a duck.
As such, interfaces are free and implicit in Python.
Data literal syntax support for lists, tuples and dicts (Java maps).
Java7?
Heterogeneous, though most of the time only tuples are heterogeneous.
Java does heterogeneous data structures of course, but loses its static typing:
Map<Object, Object>
List<Object>
Lists are a basic mutable sequence.
myList = [] # empty list
myList = ['abc'] # singleton list with a String
myList = [1, "b", [3, 4, 5]] # number, String, sublist
Tuples are the basic immutable sequence.
myTuple = () # empty tuple
myTuple = ('abc',) # singleton tuple with a string
myTuple = (1, 'b', [3,4,5]) # immutable sequence
Dictionaries - the basic associative datatype. Keys must be hashable, which usually means immutable.
myDict = { }
myDict = { 1 : "a" }
myDict = { 1 : "a", 2 : "b", "foo": 99 }
Functions are values you can store, load, pass around and return from other functions.
No need to wrap a 'function' in an object, no ceremony - easy to type.
# A python list of String (str) numbers
myList = ["10", "1", "20", "11", "21", "12"]
// Java could sort it like so
Collections.sort(myList, new Comparator<String>() {
    public int compare(String o1, String o2) {
        return Integer.valueOf(o1).compareTo(Integer.valueOf(o2));
    }
});
# python
def sortFunction(a, b):
    return cmp(int(a), int(b))

myList.sort(sortFunction)
#or more directly
myList.sort(lambda a, b: cmp(int(a), int(b)))
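Worth noting: in Python 3 `cmp` and comparator arguments are gone; the idiomatic replacement is a key function (which also works in Python 2). A minimal sketch:

```python
# Sort the string list numerically by converting each item with int.
my_list = ["10", "1", "20", "11", "21", "12"]
my_list.sort(key=int)
assert my_list == ["1", "10", "11", "12", "20", "21"]
```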
import org.python.core.*;
import org.python.util.PythonInterpreter;
public class GreenHouseController {
    public static void main(String[] args) throws PyException {
        PythonInterpreter interp = new PythonInterpreter();
        System.out.println("Loading GreenHouse Language");
        interp.execfile("GreenHouseLanguage.py");
    }
}
It's also transparent to import Java classes from Python, or to write classes in Jython that can be used by Java.
Integrated Development Environments
You can go a long way with a decent text editor like TextPad, Emacs or Notepad++, plus a shell with completion, debugging and OS-shell features like IPython.
See http://showmedo.com/videotutorials/series?name=PythonIPythonSeries
For all your IDE needs check out these links:
http://wiki.python.org/moin/IntegratedDevelopmentEnvironments
http://stackoverflow.com/questions/81584
Plenty of decent free options; PyDev, an Eclipse plugin, is perhaps a good fit for Java devs.
Good commercial IDEs:
Komodo
Wing from WingWare.
Hello World 2
Start up your shell, at the prompt >>>
print 'hello, world!'
prints the obvious to the console.
Note how single and double quotes are interchangeable in Python; there is no character-vs-string quote distinction.
Now if we put the hello.py file from hello world 1 in the same directory that the shell starts in then at the prompt:
>>> import hello
prints 'hello, world' as a side effect to stdout.
The variable hello now references a module object. Even though there's no reusable code or variables in there the module object has some properties, e.g:
hello.__name__ # 'hello'
hello.__file__ # 'hello.py'
hello.__package__ # None
Where's my java doc?
Java has a single, HTML-output-centric system for marking up code doc
strings. Python prefers to be able to deal with multiple input and output
formats, so it doesn't commit to a single doc markup strategy.
Python has pydoc in the standard library, which can analyse the code
in your package and produce an HTML page from it. It's not universally
loved. I'd probably go with epydoc if ever required to have a strict markup
format instead of 'just document it well'.
For the state of python doc markup projects check out:
http://epydoc.sourceforge.net/relatedprojects.html
This is a documentation string. If a string appears before anything other than a comment,
it is a documentation string. This applies to modules (at the top of a file), classes,
functions and methods.
Documentation is available live on an object.
import hello
print hello.__doc__
For the double underscore averse, this does the same as x.__doc__:
from inspect import getdoc
getdoc(x)
See http://www.python.org/dev/peps/pep-0257/ for doc string conventions. Most python
coding conventions are defined in peps (python enhancement proposals).
http://www.python.org/dev/peps/pep-0008/ is the style guide -> saves writing your own.
For the really import curious...
When Python imports a module, it first checks the module registry (sys.modules) to see if the module is already imported. If that’s the case, Python uses the existing module object as is.
Otherwise, Python does something like this:
1. Create a new, empty module object (this is essentially a dictionary)
2. Insert that module object in the sys.modules dictionary
3. Load the module code object (if necessary, compile the module first)
4. Execute the module code object in the new module’s namespace. All variables assigned by the code will be available via the module object.
The variable argv refers to the attribute argv (a list) in the system module.
The interpreter will load the sys module if required.
It will not be reloaded.
Note you can use the reload function to reinterpret the source code.
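The sys.modules cache described above is easy to see in action; a quick sketch using the standard json module:

```python
import sys

import json                 # first import: executed and registered in sys.modules
assert "json" in sys.modules

import json as j2           # second import: cache hit, the module is not re-executed
assert j2 is sys.modules["json"]   # exactly the same module object
```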
What's with the triple quotes?
Triple quotes provide a multiline string, which
means you don't need to add \n to start a newline
and you don't have to concatenate one string per
line with the line continuation character:
s = "foo \n" + \
    "bar \n" + \
    "baz"
Note many times (function parameters, lists, dicts, tuples etc.) you
don't need \ at the end of a line.
Whether you need multiple lines or not """doc...""" is the standard
for specifying a documentation string.
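A minimal sketch of the equivalence between a triple-quoted string and its escaped single-line form:

```python
# The literal line breaks inside triple quotes become \n characters.
multi = """foo
bar
baz"""

joined = "foo\n" + \
         "bar\n" + \
         "baz"

assert multi == joined
```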
__name__?
Remember that property of hello that we looked at in the shell.
When imported from another module hello.py has the __name__ property value 'hello'.
A special case is when the file is run as 'main' by the interpreter, e.g:
python hello.py (or double-click hello.py etc.)
Then the module has the __name__ property '__main__'. So we can do different things depending on
whether we are run as a script or imported as a library.
Ternary Operator
x if predicate else y
This is python's version of the java/C++ ternary operator
predicate ? x : y
Older python code (pre python2.5) had to use the 'and or' trick,
which wasn't 100% safe; avoid it now.
predicate and x or y
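Why the 'and or' trick wasn't 100% safe is easy to demonstrate: it silently gives the wrong answer whenever x is falsy. A minimal sketch:

```python
# The conditional expression:
label = "big" if 10 > 5 else "small"
assert label == "big"

# The old trick, 'predicate and x or y', breaks when x is falsy:
broken = True and 0 or 99     # evaluates to 99, not the intended 0
safe = 0 if True else 99      # the conditional expression gets it right
assert broken == 99
assert safe == 0
```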
Hello World Library and Program
A new hello.py file:
"""hello
Provides functions for saying hi.
"""
from sys import argv
def hey(name):
"""Say hello to name three times."""
print "hello, %s!" % (name)
print "hello, {0}!".format(name)
print "hello, {target}!".format(target=name)
if __name__ == "__main__":
hey(argv[1]) if len(argv) > 1 else hey("world")
Ok, there is no build.
Python compiles your code to bytecode as needed putting the bytecode in a .pyc (python compiled) file.
At import time, if the .py file is newer than the .pyc file, it is recompiled:
.py -> .pyc
Python is interpreted and VM-based - it's not compiled in the native sense, though if a package uses a C extension, that package is probably specific to a Python release and OS.
u"U what?"
There's a couple of string prefixes you might come across.
u"A unicode string."
r"A raw string."
ur"A raw unicode string."
Raw means special backslash escape sequences are not substituted.
r'\n' is not a newline.
r"foo \t bar" has no tab in it, but "foo \t bar" does.
r"C:\foo\bar\baz" == "C:\\foo\\bar\\baz"
Handy for clearer regular expressions!
Naturally you can use these with multiline strings as well.
foo = ur"""Hey \t ho!
Howdy Pardner!
"""
bar = u"Hey \\t ho!\nHowdy Pardner!\n"
foo == bar # True
Unicode string prefixes may be redundant in Python 3, where unicode handling is done cleanly and is the default for strings. I need to check one day!
An example sys.path, printed at the shell:
['',
'C:\\Python26\\scripts',
'C:\\Python26\\lib\\site-packages\\docutils-0.6-py2.6.egg',
'C:\\Python26\\lib\\site-packages\\webunit-1.3.8-py2.6.egg',
'C:\\Python26\\lib\\site-packages\\functional-0.7.0-py2.6.egg',
'C:\\Python26\\lib\\site-packages\\funkload-1.11.0-py2.6.egg',
'C:\\Python26\\lib\\site-packages\\pylint-0.19.0-py2.6.egg',
'C:\\Python26\\lib\\site-packages\\logilab_astng-0.19.3-py2.6.egg',
'C:\\Python26\\lib\\site-packages\\logilab_common-0.46.0-py2.6.egg',
'C:\\Python26\\lib\\site-packages\\pydbgr-0.1.3-py2.6.egg',
'C:\\Python26\\lib\\site-packages\\tracer-0.2.3-py2.6.egg',
'C:\\Python26\\lib\\site-packages\\pyficache-0.1.3-py2.6.egg',
'C:\\Python26\\lib\\site-packages\\import_relative-0.1.0-py2.6.egg',
'C:\\Python26\\lib\\site-packages\\columnize-0.3.2-py2.6.egg',
'C:\\Python26\\lib\\site-packages\\coverage-2.85-py2.6.egg',
'C:\\Python26\\lib\\site-packages\\typecheck-0.3.5-py2.6.egg',
'C:\\WINDOWS\\system32\\python26.zip',
'C:\\Python26\\DLLs',
'C:\\Python26\\lib',
'C:\\Python26\\lib\\plat-win',
'C:\\Python26\\lib\\lib-tk',
'C:\\Python26',
'C:\\Python26\\lib\\site-packages',
'c:\\python26\\lib\\site-packages',
'C:\\Python26\\lib\\site-packages\\IPython/Extensions',
u'C:\\Documents and Settings\\wardk\\My Documents\\emacs\\_ipython']
For the really curious, sys.path is initialised by the module python/lib/site.py
at interpreter initialisation time.
ImportError! Python can't find my file!
You can always import a file or package in the current working directory. It's the first place checked & it's easy enough to move around:
import os
print os.getcwd()
os.chdir("...")
Otherwise sys.path is the magic attribute that determines whether Python can find your module. Similar to Java's CLASSPATH.
sys.path can be changed dynamically with ease (it's just a list).
python-install-dir/lib/site-packages is where all your installed
packages go.
Other ways to add to the system path:
PYTHONPATH environment variable - same syntax as os PATH.
Make a foo.pth path file, put it somewhere in the existing sys.path. Each line has an entry to append to sys.path.
Python supports automatic unpacking of sequence types into an equivalent number of variables. (A sequence type is anything that supports iteration.)
t = (1, 2, 3)
a, b, c = t # a=1, b=2, c=3
a, b = t # error too few variables
a, b, c, d = t # error too many variables
When iterating over a dictionary's items each element is a pair (2 tuple) containing the key and value.
for k, v in someDictionary.items():
    ...
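A runnable sketch of both forms of unpacking described above:

```python
t = (1, 2, 3)
a, b, c = t                    # one variable per element
assert (a, b, c) == (1, 2, 3)

d = {"x": 1, "y": 2}
pairs = sorted(d.items())      # each item is a (key, value) 2-tuple
assert pairs == [("x", 1), ("y", 2)]

total = 0
for k, v in d.items():         # unpacking in the for statement itself
    total += v
assert total == 3
```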
Functions 101
def f(a,b): ...
def f(a,b,c): ...
def f(a, b, c=None): ...
Defaults are calculated once and stored with the function definition code.
If it's a mutable value you could change it and see a different value across function invocations.
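The mutable-default behaviour above is the classic Python gotcha; a minimal sketch (append_to is a made-up function):

```python
def append_to(item, bucket=[]):    # the [] is created once, at def time
    bucket.append(item)
    return bucket

assert append_to(1) == [1]
assert append_to(2) == [1, 2]      # the same list, mutated across calls!
```

The usual defensive idiom is `bucket=None` plus `if bucket is None: bucket = []` inside the body.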
Pass what?
Python uses whitespace indentation to mark out blocks, so you need some keyword, i.e. 'pass', to indicate a no-operation when marking out stubs etc.
The following is a common idiom for defining a general module-scoped exception class.
class Error(Exception):
    pass
Calling Functions
You can name the parameters when you call a function. In this case you are said to be using 'keyword' or 'named' parameters. These must go after unnamed or 'positional' arguments.
As named parameters are explicit, you can give them in any order!
def foo(a, b, c=None, d=1, e=2):
    pass

foo(99, 100, e=5, c=101, d=9) # ok!
foo(1, d=99, b=44) # ok!
So what?
Using named/keyword parameters in function calls is clearer.
It's safer when multiple parameters have the same type.
No Builder design pattern needed to simplify complex constructors.
Functions 301, Extra Bits
def h(a, b, c=3, *args, **kwargs):
    print "PosArgs"
    for posarg in args:
        print posarg
    print "KWArgs"
    for key, value in kwargs.items():
        print key, value
h(1, 2, 3, 'a', 'b', foo=99, bar=666) # outputs =>
PosArgs
a
b
KWArgs
foo 99
bar 666
Additional unspecified parameters (var args in C++/Java) can be given.
A good example of this in use is the dict constructor (creates a dictionary object). The following are equivalent:
dict(a=1, b=2)
{'a' : 1, 'b' : 2}
* and ** in function calls
When used in making a function call, not defining a function:
* can unpack a tuple
** can unpack a dict
E.g. these are equivalent:
dict(a=1, b=2)
dict(** {'a' : 1, 'b' : 2})
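A runnable sketch of both call-site unpackings (add is a made-up function):

```python
def add(a, b, c):
    return a + b + c

args = (1, 2, 3)
assert add(*args) == 6             # * unpacks a tuple into positional args

kwargs = {"a": 1, "b": 2, "c": 3}
assert add(**kwargs) == 6          # ** unpacks a dict into keyword args
```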
foo in bar actually expands to:
bar.__contains__(foo)
object.__contains__ is one of many methods to override.
The 'in' operator
for ... in ...
This does generic sequence iteration, like Java's for-each loop, and is unrelated to the 'in' operator.
>>> "hello" in "hello, world"
True
Is one object inside another object? What this means varies from object to object.
Strings : is one string the substring of another
lists : is the object an element of the list
dict : is the key found in the dict
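A sketch covering the three meanings just listed:

```python
assert "ell" in "hello"            # string: substring test
assert 2 in [1, 2, 3]              # list: element membership
assert "k" in {"k": 1}             # dict: tests the keys...
assert 1 not in {"k": 1}           # ...not the values
```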
Executable Pseudocode
A reasonable claim indeed. Use of whitespace indentation for blocks and some 'wordy' operators contribute to this idea.
Python vs Java:
True true
False false
not !
and &&
or ||
in
not in
!= !equals
== equals
is == 'is' uses object identity via id() function
is not !=
Same:
+, -, *, <, >, <=, >=
>>, <<, &, ^, |, ~
Operator Overloading
See all the methods that can be overridden on the base object class,
like the operators __mul__, __sub__ and many others, at:
http://docs.python.org/reference/datamodel.html#basic-customization
One interesting one is multiplication of a string:
" " * 80 # => new string with 80 spaces
Quick warning! Unlike Java, Python basically only has local and global scopes.
There is no block scope in python.
for thing in thingList:
    ...

thing # ok, the last thing in thingList is still in scope.
try:
    xmldoc = parsexml(stuff)
except: # any exception
    xmldoc = None
# now use xmldoc
To mutate module globals you need to use the 'global' keyword.
foo.py:
thing = ...
def f():
    global thing # thing is a new local reference without this!
    thing = somethingelse
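A runnable sketch of the global keyword in action:

```python
counter = 0

def bump():
    global counter    # without this, 'counter += 1' would fail: the
    counter += 1      # assignment would make counter a new local name

bump()
bump()
assert counter == 2
```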
Object Hierarchy
class object is the object hierarchy base class.
In Python 3 all classes are derived from object without having to specify it. In
Python 2.x there are some subtle, not so interesting differences between:
class MyClass(object): pass
class MyClass: pass
The object derived classes are 'new style' classes (well new since python 2.2).
Old style 'classic classes' cannot do everything new ones can, so just make
yourself a new style one to be safe.
Quick Roundup
class MyClass(object):
    """Documentation for MyClass"""
    someStaticMemberVar = "you"

    def __init__(self):
        """Initialiser. Java constructor."""
        super(MyClass, self).__init__()
        self.publicThing = 1
        self._privateThing = 2
        self.__nameMangledPrivateThing = 3

    def greet(self, name):
        """Prints a greeting to name"""
        print "hey {name}!".format(name=name)
c = MyClass()
c.greet("bud") # -> "hey bud!"
c.greet(MyClass.someStaticMemberVar) # "hey you!"
c.publicThing # => 1
c._privateThing # => 2
c.__nameMangledPrivateThing # => AttributeError exception
c._MyClass__nameMangledPrivateThing # => 3
ABCs (Abstract base classes)
Languages with multiple inheritance tend to use ABCs instead of interfaces - python won't ever get 'interfaces' - ABCs are interfaces effectively.
An ABC lets inspectors of classes determine if a class meets a protocol.
Python is still very much a duck typed language; e.g. many functions take file-like objects (objects with read/write methods).
ABCs are new (python2.6) and only used where introspection is clumsy - e.g. used with the numeric tower in python3.
Very similar to Java, minus the checked exceptions, given Python's dynamic nature.
Python vs Java
raise throw
try try
except catch
finally finally
class Error(Exception):
    """General Exception particular to this module. Non system exiting
    exceptions are derived from Exception. See the standard exception
    hierarchy at: http://docs.python.org/library/exceptions.html.
    """
    pass
raise Error("something bad happened")
try:
    ...
except (ExceptionType1, ExceptionType2), e:
    ...
except Exception, e: # all user exceptions and standard exceptions
    ...
    raise # by itself rethrows
except KeyboardInterrupt, e:
    ...
except BaseException, e: # everything ultimately derived from this
    ...
finally: # usual clean up block, always called.
    pass
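A runnable sketch of the raise/except flow (note this uses the Python 3 'except ... as e' spelling; the ', e' form above is Python 2 only):

```python
class Error(Exception):
    """General module-scoped exception, as above."""
    pass

try:
    raise Error("something bad happened")
except Error as e:
    message = str(e)

assert message == "something bad happened"
```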
Lists have no obvious mutating method on them that can replace all the values in the someList parameter.
Actually, there is __setslice__, which will make sense later:
def foo(someList):
    someList[:] = range(10, 100, 2) # mutates the parameter
Immutable:
Strings, numbers & tuples
Mutable:
Lists
Most other objects
Assigning a new value to a function parameter is just going to create a new local reference!
def foo(someList):
    someList = range(10, 100, 2) # a local ref to a new list
Java has an immutable string class, which leads to the StringBuilder class to
efficiently construct big strings instead of using string += otherString repeatedly.
Python's equivalent idiom is:
delimiterString.join( stringSequence )
" ".join( ("foo", "bar", "baz") ) # => "foo bar baz"
Const? Final?
Python has no keyword to set immutability of a reference or an object's state.
Any class can be made immutable by overriding attribute setting:
class Immutable(object):
    """An immutable class with a single attribute 'value'."""

    def __init__(self, value):
        # we can no longer use self.value = value to store the instance data
        # so we must explicitly call the superclass
        super(Immutable, self).__setattr__('value', value)

    def __setattr__(self, *args):
        raise TypeError("can't modify immutable instance")

    __delattr__ = __setattr__
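Exercising the pattern above (the class is restated so this runs standalone):

```python
class Immutable(object):
    """One read-only attribute, 'value'; mutation attempts raise TypeError."""
    def __init__(self, value):
        super(Immutable, self).__setattr__('value', value)
    def __setattr__(self, *args):
        raise TypeError("can't modify immutable instance")
    __delattr__ = __setattr__

obj = Immutable(42)
assert obj.value == 42
try:
    obj.value = 99
except TypeError:
    pass                # expected: mutation is refused
else:
    raise AssertionError("should have raised TypeError")
assert obj.value == 42  # unchanged
```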
In Java the folder hierarchy matches the package name hierarchy.
Python does the same.
Java adds the declaration to each file:
package foo.bar.baz;
Python adds nothing to the files but instead adds a file
__init__.py
to a folder to make it a package (or subpackage).
Often __init__.py will only have module (now package) documentation
string, if anything.
There are a couple of special variables that are recognised in __init__.py.
__all__ # specifies what "from package import *" does
__path__ # never used it, who knows!
The python language is named after the classic British Comedy Monty Python!
Nothing to do with snakes!
Cheese Shop sketch at:
http://www.youtube.com/watch?v=B3KBuQHHKx0
Finding Packages
Check out the python Cheese Shop!
More formally known as the python package index "PyPI".
http://pypi.python.org/pypi
This is a central repository (somewhat like Maven central in Java) with, as of the time of writing, 8853 packages.
More on Eggs
http://stackoverflow.com/questions/47953/what-are-the-advantages-of-packaging-your-python-library-application-as-an-egg-f
http://www.ibm.com/developerworks/library/l-cppeak3.html
Installing Packages
python setup.py install
Installs the package to lib/site-packages/package-name.
This builtin installation method requires you to manually delete packages - there is no uninstall command.
easy_install.exe package-name
Downloads from the web and pulls in dependencies if the package sets up the correct metadata.
May use python egg files, a binary installation format that works with easy_install. Still no uninstall command :(
Eggs are to python as jars are to java. More likely used if there's C extensions involved. Can be 'installed' just by placing in your sys.path.
The Python packaging situation is not entirely satisfactory, much work goes on at the moment. See "A history of python packaging" for more info:
http://faassen.n--tree.net/blog/view/weblog/2009/11/09/0
Simple solution, often good enough:
class RGBColor:
    Red, Green, Blue = range(3)

x = RGBColor.Red
Not fully type safe. A fully type safe version would be more tedious and verbose,
which is where those 3rd party libraries come in.
There is an option to start the Python interpreter with optimisations turned on, though they don't have
that big an effect.
With the python.exe -O flag, __debug__ becomes False.
.pyo files are used instead of .pyc files.
See http://docs.python.org/tutorial/modules.html for details.
If there really is an important speed bottleneck in a python program a C extension
maybe considered.
There's a standard xunit unit testing framework for python:
import unittest

class MyTest(unittest.TestCase):
    def setUp(self): ...    # note the capital U; 'setup' would silently never run
    def tearDown(self): ...
    def testBlah(self): ...
    def testFoo(self): ...

if __name__ == '__main__':
    unittest.main()
assert is a python keyword.
assert expression, "you suck"
The assert keyword expands the code to:
if __debug__: # True by default, immutable.
    if not expression: raise AssertionError, "you suck"
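A sketch of assert in action (reciprocal is a made-up function; remember asserts vanish under -O, so don't use them for input validation):

```python
def reciprocal(x):
    assert x != 0, "x must be non-zero"
    return 1.0 / x

assert reciprocal(2) == 0.5
try:
    reciprocal(0)
except AssertionError as e:
    failure = str(e)
assert failure == "x must be non-zero"
```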
Code Checkers
There are a few tools to do analysis of your code for likely bugs or poor style.
pep8 - checks your code against the style conventions in pep8
pylint, pyflakes, pychecker - all check your code in various ways, with pylint probably being the most solid offering. Some will just analyse the source; others, like pylint, will import your code. They check general code standards and look for signs of bugs.
Some IDEs, like pyscripter, have shortcuts to run pylint on the code in your current file or project. The checker program often needs installing first though.
Indexing
ns = range(10) # [0..9]
Standard indexing:
ns[0] # 0
ns[1] # 1
Negative indexing, reverse order:
ns[-0] # 0, same as [0]
ns[-1] # 9
ns[-2] # 8
ns[-10] # 0
Define your own indexable objects:
object.__setitem__
object.__getitem__
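A minimal sketch of a custom indexable object via __getitem__ (Squares is a hypothetical class):

```python
class Squares(object):
    """A hypothetical indexable object: sq[i] is i squared."""
    def __getitem__(self, i):
        return i * i

sq = Squares()
assert sq[3] == 9
assert sq[10] == 100
```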
Slicing
Any ordered sequence or object that implements the slicing methods. Strings, lists, tuples...
ns = range(10) # [0..9]
Slices:
ns[ : 3] # [0,1,2]
ns[0 : 3] # [0,1,2]
ns[3 : 6] # [3,4,5]
ns[3 : ] # [3..9]
ns[3 : 0] # [ ]
ns[4 : -2] # [4,5,6,7]
ns[-4 : -2] # [6,7]
ns[4 : 2] # [ ] nonsensical range
ns[-2: -4] # [ ] nonsensical range
Stepping:
ns[ : : 2] #[0,2,4,6,8]
ns[0 : -1 : 2] #[0,2,4,6,8]
ns[ : : 8] #[0,8]
ns[ : : 99] #[0]
ns[5 : : 2] #[5,7,9]
ns[ : : -1] #[9,8...1,0] steps over the sequence in reverse
ns[ : -1 : -2] #[ ] mixing negative steps and range specs doesn't work.
Slice assignment:
ns[2:4] = (22,33,44) # ns = [0,1,22,33,44,4,5,6,7,8,9]
ns[5:7] = 55,66,77 # ns = [0,1,22,33,44,55,66,77,6,7,8,9]
ns[:] = (1,) # ns = [1]
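A runnable check of the indexing and slicing rules above (list(range(...)) because Python 3's range is lazy):

```python
ns = list(range(10))
assert ns[:3] == [0, 1, 2]
assert ns[-2:] == [8, 9]
assert ns[::2] == [0, 2, 4, 6, 8]
assert ns[::-1] == [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

ns[2:4] = (22, 33, 44)              # slice assignment can change the length
assert ns == [0, 1, 22, 33, 44, 4, 5, 6, 7, 8, 9]
```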
Resource Management 1
class MyClass(object):
    """Doc ..."""

    def __init__(self, useDB=True):
        if useDB:
            self._dbconnection = # ... getDBCon
        self._dbconnClosed = not useDB

    def close(self):
        if not self._dbconnClosed:
            self._dbconnection.close()
            self._dbconnClosed = True

c = MyClass()
try:
    # ... use c
finally:
    c.close() # tedious and error prone
del
del is a keyword ... that doesn't ever call __del__ on an object.
def func():
    c = MyClass()
    # ...
    del c
    c # NameError exception: c is not defined
As we see, it unbinds the variable, so that at some unspecified time from now the garbage collector can collect what c once referenced. At that point __del__ will be run on that object.
I've rarely seen del. At the shell/repl it's more likely you'll just reassign the variable.
Resource Management 2
More popular than java finalisers, but not 100% safe...
class MyClass(object):
    ...
    def __del__(self):
        """Finaliser. Often used as a C++ destructor, though don't
        recommend relying upon it. Circular refs between 2 objects
        with __del__ methods can screw up GC.
        __del__ is not the opposite of __init__; this can run even if
        __init__ raises an Exception.
        """
        self.close()

c = MyClass(useDB=True)
Later c goes out of scope and the non-deterministic garbage collection usually triggers the __del__ method.
Resource Management 3
class MyClass(object):
    ...
    def __enter__(self):
        """Context manager support, return self to be bound to the variable in
        the 'as' part of a 'with...as' statement.
        """
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        """Context manager support, close the db connection when the with...as
        block exits.
        """
        self.close()
MyClass is now a context manager. The 'with' and 'as' keywords (new in python2.6)
manage contexts:
with MyClass() as c: # __enter__ called by with, returning the resource to manage
    ... # use c
Block goes out of scope and __exit__ is called closing the dbconn.
Similarly:
c = MyClass()
with c:
    ... # use c
Block goes out of scope and __exit__ is called closing the dbconn.
Common standard context managers include files and locks.
with open("filename.txt") as f:
... # use f
lock = threading.Lock()
...
with lock: # __enter__ calls lock.acquire()
...
# __exit__ calls lock.release()
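The enter/body/exit ordering is easy to verify with a small context manager of your own (Tracker is a hypothetical class, not from the source):

```python
class Tracker(object):
    """A hypothetical context manager that records enter/exit calls."""
    def __init__(self):
        self.events = []
    def __enter__(self):
        self.events.append("enter")
        return self
    def __exit__(self, exc_type, exc_val, exc_tb):
        self.events.append("exit")
        return False               # don't swallow exceptions

t = Tracker()
with t:
    t.events.append("body")
assert t.events == ["enter", "body", "exit"]
```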
Nested Functions (or Classes)
Just to point out that they exist really, which is another tool in your encapsulation arsenal. I've often nested functions where only the outer function needs the inner function(s) to get work done.
They are also the visible part of using closures: functions that capture their lexical environment, bundling functionality and data.
def make_adder(n):
    def add(x):
        return n + x
    return add

add5 = make_adder(5) # first class functions in action
add5(10) # => 15
Closures are more for a talk about functional programming. Classes can do a similar job to closures, sometimes better, often just different:
class AdderMaker(object):
    def __init__(self, n):
        self.n = n

    def __call__(self, x):
        return self.n + x
__call__ is the operator()() of C++, the function call operator. In java it would be the Callable interface.
Lambda
Named after the lambda calculus, the lambda keyword defines an anonymous function at runtime. There's no return statement and it consists of a single expression with no statements. Fans of functional programming may get annoyed by the one-expression limit, but it keeps things simple.
These are equivalent:
plus2 = lambda n: n + 2
def plus2(n): return n + 2
Since we normally use lambda when we don't want to name the function, it's often seen directly inside a function call:
map(lambda n: n * 2, [1,2,3]) # => [2,4,6]
sorted(["10", "1", "20", "11", "21", "12"],
lambda a, b: cmp(int(a), int(b))) # => ["1", "10", "11", "12", "20", "21"]
map is a higher order function that transforms a sequence to a new list with the given function.
What is laziness as an evaluation strategy?
def square(n):
    return n * n
square(1 + 2) # => 9
In a language using "call by value" evaluation, the expression 1 + 2
is evaluated to the value 3 before being given to square. This is
eager evaluation at work.
In a "call by name" evaluation strategy a reference to the expression
(1 + 2) is passed in.
Naively done this means that square(n) expands to:
(1 + 2) * (1 + 2).
Lazy evaluation = structural sharing of expressions + call by name.
An expression shouldn't be evaluated twice.
Laziness is good
As all good programmers know!
Here's a piece of Haskell, a fully lazy, pure functional language, defining an infinite list of natural numbers, then taking a slice of the list: the first 5 numbers.
nats = [1..]
take 5 nats -- [1,2,3,4,5]. In Python: itertools.islice(nats, 5)
nats -- now that would blow up
Benefits of lazy evaluation
further modularise code. No buffering concerns!
Python, like other mainstream languages uses eager/strict evaluation, but has
lazy support in the form of generators.
The depths of for ... in
_iter = iter(obj) # Get iterator object
while 1:
    try:
        x = _iter.next() # Get next item
    except StopIteration: # No more items
        break
    # statements
    ...
Generators
Generators are provided specially by the language. The previous natural-numbers stream can be done by:
def nats():
    n = 1
    while True:
        yield n
        n += 1

ns = nats() # calling nats() returns a generator object; no body code runs yet
ns.next() # => 1, runs up to and including the first yield
next(ns) # => 2 # resumes at n += 1 and loops round to yield again
Generators are essentially sequence generators (if they produce more than 1 thing) and so they are iterable like any other sequence.
for n in ns:
    ...
    break # break at some point or we'll never get out of this loop
Unlike a normal iterable, a generator can only be consumed once.
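The generator above, restated so it runs standalone (Python 3 spells the resume call next(gen) rather than gen.next()):

```python
import itertools

def nats():
    n = 1
    while True:
        yield n
        n += 1

ns = nats()
assert next(ns) == 1
assert next(ns) == 2
# The lazy infinite stream, sliced as promised earlier:
assert list(itertools.islice(nats(), 5)) == [1, 2, 3, 4, 5]
```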
Succinct declarative construction of a list based upon mapping (transforming) and/or filtering elements in a sequence. Use plentifully!
Mapping/transforming example only.
ns = [i * 2 for i in range(10)] # => [0, 2, 4, ... 16, 18]
equivalent to:
map(lambda i: i * 2, range(10))
Filtering example only:
ns = [i for i in range(100) if isprime(i)] # => [2,3,5,7,11,13,17...]
equivalent to:
filter(isprime, range(100))
Map and filter:
ns = [i * 2 for i in range(100) if isprime(i)] # => [4,6,10,14,22,26,34...]
equivalent to:
map(lambda i: i * 2, filter(isprime, range(100)))
The list comprehension is more readable than map + filter combined.
It's more declarative not applicative.
In the same way you can nest loops you can nest sequence generators:
[(x,y) for x in range(5) for y in range(5) if x * y < 5] # =>
[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4),
(2, 0), (2, 1), (2, 2), (3, 0), (3, 1), (4, 0), (4, 1)]
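A runnable sketch of the mapping, filtering and nesting forms just shown (using list(map(...)) since Python 3's map is lazy):

```python
doubled = [i * 2 for i in range(10)]
assert doubled == list(map(lambda i: i * 2, range(10)))

evens = [i for i in range(10) if i % 2 == 0]
assert evens == [0, 2, 4, 6, 8]

pairs = [(x, y) for x in range(2) for y in range(2)]   # nested loops
assert pairs == [(0, 0), (0, 1), (1, 0), (1, 1)]
```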
Generator comprehension
Use parentheses instead of list syntax around a comprehension and you get a
generator comprehension so each element is lazily evaluated.
primesX2 = (n * 2 for n in range(123457890) if isprime(n))
Modularise code via declarative generator pipelines:
hugeLogFiles = (open(file) for file in glob.glob("*.log"))
splitLineStream = (line.split(someDelimiter) for line in hugeLogFiles)
matchingLines = (lineData for lineData in splitLineStream if foobar(lineData))
results = (int(lineData[0]) + int(lineData[2]) for lineData in matchingLines)
# consume the result data you're interested in...
We declare what we want to do with a whole stream in each generator comprehension.
Python 3 Extra Comprehensions
Set comprehensions:
{ x for x in foo if check(x) }
set([x for x in foo if check(x)]) # In python 2, extra overhead
Dictionary comprehensions:
{ xToKey(x) : xToValue(x) for x in foo if something(x) }
Backported to python2.7
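A runnable sketch of both comprehension forms (valid in Python 3 and Python 2.7):

```python
squares = {x * x for x in range(4)}            # set comprehension
assert squares == {0, 1, 4, 9}

lengths = {w: len(w) for w in ["a", "bb"]}     # dict comprehension
assert lengths == {"a": 1, "bb": 2}
```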
dir([object]) -> list of strings
If called without an argument, return the names in the current scope.
Else, return an alphabetized list of names comprising (some of) the attributes
of the given object, and of attributes reachable from it.
If the object supplies a method named __dir__, it will be used; otherwise
the default dir() logic is used and returns:
for a module object: the module's attributes.
for a class object: its attributes, and recursively the attributes
of its bases.
for any other object: its attributes, its class's attributes, and
recursively the attributes of its class's base classes.
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__',
'__delslice__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__',
'__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__',
'__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__',
'__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__',
'__rmul__', '__setattr__', '__setitem__', '__setslice__', '__sizeof__',
'__str__', '__subclasshook__',
'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
Python has strong introspection capabilities.
Most properties of the Python runtime and its objects are available for examination and change at runtime.
You can examine the stackframe.
You can examine functions, grab bytecode etc.
You can examine and add new members/functions to class instances & modules.
You can examine and add new static member variables to classes.
You could even change a class' baseclass.
inspect is a great module.
dir is probably the best function for examining an object. What does it do?
print dir.__doc__
E.g. at the shell:
dir( [ ] ) # dir an empty list instance
type is another useful function (actually callable class) for introspection
type( [ ] ) # type of list class instance is 'list'. [ ].__class__.__name__
type( list ) # 'type'. Classes are objects too!
getattr, hasattr functions
hasattr([], 'append') # True
getattr(sys, 'path') # same as sys.path
Dynamic Attribute Lookup
A class that transparently wraps an expensive to create object.
Shows hooking into attribute access, variadic function args.
def makeExpensiveObject(param1, param2):
    ...

thing = Lazy(makeExpensiveObject, param1=foo, param2=bar)
x = thing.someAttribute # only at this point is the object created.
class Lazy(object):
def __init__(self, calculateFunction, **calcFuncKWArgs):
self._wrappedValue = None
self._calculator = calculateFunction
self._calculatorArgs = calcFuncKWArgs
def __getattribute__(self, name):
# Need to use the object.__getattribute__ method otherwise self.x, or
# self.__dict__['x'] etc. will just cause infinite recursion to this method.
wrappedValue = object.__getattribute__(self, '_wrappedValue')
if wrappedValue is None:
calculatorFunc = object.__getattribute__(self, '_calculator')
calculatorKWArgs = object.__getattribute__(self, '_calculatorArgs')
# Note ** unpacks the keyword args dict as function keyword parameters.
# Note ** replaces the built-in apply function in usage, deprecating apply.
wrappedValue = calculatorFunc( **calculatorKWArgs )
object.__setattr__(self, '_wrappedValue', wrappedValue)
return getattr(wrappedValue, name)
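Assembled and exercised end-to-end, a self-contained sketch (makeExpensiveObject here is a cheap stand-in for any costly constructor; the calls list just counts real constructions):

```python
class Lazy(object):
    """Wraps a factory; builds the real object on first attribute access."""
    def __init__(self, calculateFunction, **calcFuncKWArgs):
        self._wrappedValue = None
        self._calculator = calculateFunction
        self._calculatorArgs = calcFuncKWArgs

    def __getattribute__(self, name):
        # object.__getattribute__ avoids infinite recursion into this hook.
        wrappedValue = object.__getattribute__(self, '_wrappedValue')
        if wrappedValue is None:
            calculatorFunc = object.__getattribute__(self, '_calculator')
            calculatorKWArgs = object.__getattribute__(self, '_calculatorArgs')
            wrappedValue = calculatorFunc(**calculatorKWArgs)
            object.__setattr__(self, '_wrappedValue', wrappedValue)
        return getattr(wrappedValue, name)

calls = []
def makeExpensiveObject(text):
    calls.append(1)           # record each real construction
    return text.upper()

thing = Lazy(makeExpensiveObject, text="hello")
print(len(calls))             # 0 - nothing built yet
print(thing.lower())          # 'hello' - first access builds the value
print(len(calls))             # 1 - built exactly once, then cached
thing.strip()
print(len(calls))             # still 1
```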
What's a callable?
Anything that can be called like a function.
This includes class objects!
__call__ is a method on objects, much the same as operator()() in C++ or an implementation of the Callable interface in Java.
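A tiny sketch of the __call__ hook (the Adder class is illustrative):

```python
class Adder(object):
    """An instance that behaves like a function, akin to C++ operator()()."""
    def __init__(self, n):
        self.n = n
    def __call__(self, x):
        return self.n + x

add5 = Adder(5)
print(add5(10))          # 15 - the instance is called like a function
print(callable(add5))    # True
print(callable(Adder))   # True - class objects are callables too
```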
Decorators - dynamic metaprogramming
The easiest metaprogramming feature in python! Provides syntactic sugar for metaprogramming with callables.
A decorator is a callable that takes a function (or class) and returns a replacement for it.
Java devs need to remember that whilst the @ syntax is very similar, Java's annotations are about adding metadata to objects, whereas python decorators are about metaprogramming - transforming code objects.
staticmethod vs classmethod
static methods are simply functions scoped to a class; they receive no implicit first argument, so any class reference inside them (like Foo.bar below) is fixed when written.
class methods take the class they are called on as the first argument, so they behave polymorphically under subclassing (unlike java static methods).
class Foo(object):
bar = 1
@staticmethod
def f():
print Foo.bar
@classmethod
def g(cls):
print cls.bar
class Baz(Foo):
bar = 2
Foo.f() # => 1
Baz.f() # => 1
Foo.g() # => 1
Baz.g() # => 2
Decorators - syntactic sugar
Pre python2.4 and without decorators one would do this:
class Foo(object):
def f(a, b, c): # note no 'self' parameter
...
f = staticmethod(f) # rebinds Foo.f to the transformed (static) function
Foo().f(1,2,3) # Can call the static method on a class object or
Foo.f(1,2,3) # with an instance object. User friendly.
Decorators are syntactic sugar for what we just did, so the following class is the same:
class Foo(object):
@staticmethod
def f(a, b, c):
...
More decorator examples
It's up to the decorator to return a sensible new function.
Below are four trivial examples showing salute decorators in action: a plain function decorator, a class-based decorator, and each again taking arguments (listings follow later).
The python interpreter decoration process differs when using decorator arguments vs no decorator arguments.
class Foo(object):
def __init__(self):
self._lock = threading.Lock()
@synchronized_method() # note the parens: this decorator takes an optional lock name
def bar(self):
...
@synchronized_method()
def baz(self):
...
The attribute name '_lock' matches the default lock name expected by the decorator.
Instance based synchronised locking
def synchronized_method(lockSymbolName="_lock"):
"""Decorate a method synchronising on the lock attribute
found in the instance of the decoratees class's method. Must be a method,
not a function as we assume the first arg is the self parameter.
"""
def wrap(f):
def synchronized_function(*args, **kw):
assert args, "synchronized_method decoratee no self"
try:
lock = getattr(args[0], lockSymbolName)
except AttributeError:
assert False, "synchronized_method decoratee no self.lock..."
with lock:
return f(*args, **kw)
return synchronized_function
return wrap
A Synchronised Decorator
Python has no 'synchronized' keyword.
In java this acquires the builtin lock that each java object contains (by virtue of being derived from the base Object class).
Let's try to emulate java's synchronized keyword:
def synchronized(lock):
"""Decorate a method with this and pass in a threading.Lock object to
ensure that a method is synchronised over the given lock.
"""
def wrap(f):
def synchronized_function(*args, **kw):
lock.acquire()
try:
return f(*args, **kw)
finally:
lock.release()
return synchronized_function
return wrap
class Foo(object):
_lock = threading.Lock()
@synchronized(_lock)
def bar(self):
...
@synchronized(_lock)
def baz(self):
...
Ok, so this is a class level lock, not instance level.
More decorator examples at: http://wiki.python.org/moin/PythonDecoratorLibrary
Decorating your decorators
You can apply a decorator to your decorators ad infinitum.
One module built for this is the decorator module, which aims to make writing well-behaved decorators easier.
http://pypi.python.org/pypi/decorator
Aren't decorators easy enough?
Well, you may soon find that the function returned by your decorator has neither the same signature nor the same documentation string as the original - something IDEs and code inspectors rely on. You'd have to do some extra work to keep these in place.
Consider the previous foo function decorated via salute
foo.__name__ # 'saluter' (the inner wrapper); undecorated it would be 'foo'
foo.__doc__ # None; undecorated it would be "adds a & b"
from inspect import getargspec
getargspec(foo)
# => ArgSpec(args=[], varargs='args', keywords=None, defaults=None)
Undecorated:
# => ArgSpec(args=['a', 'b'], varargs=None, keywords=None, defaults=None)
Applying the decorator module's @decorator to our salute function (or functools.wraps to the wrapper it returns) fixes this.
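The standard-library fix is functools.wraps, which copies the wrapped function's name, docstring and other metadata onto the wrapper. A sketch using a salute-style decorator:

```python
import functools

def salute(func):
    @functools.wraps(func)   # copies __name__, __doc__, __module__ etc.
    def saluter(*args):
        print("hello")
        ret = func(*args)
        print("you")
        return ret
    return saluter

@salute
def foo(a, b):
    """adds a & b"""
    return a + b

print(foo.__name__)   # 'foo' - not 'saluter'
print(foo.__doc__)    # 'adds a & b'
print(foo(1, 2))      # prints hello/you, returns 3
```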
Decorate the decorated
It should be no surprise that you can decorate with many decorators, just as we could chain passing a function through multiple callables.
@synchronised(_lock)
@logging
@salute("hey", "ho")
@typecheck(str, str, int)
@condition(..., ...) # pre/postconditions
def foo(a,b):
...
Whilst there isn't a standard python decorator library, decorators like all of these exist in third-party libraries. Except no one would use salute.
Aspect Oriented Programming
Java programmers have probably realised most (all?) of the benefits of AOP can be gotten via decorators, with greater simplicity.
Decorators require no xml, yay!
Before, after and around decorators are simple.
With AOP in Java it's easier to specify multiple decoratees via a regular expression. Python could use metaclasses for this.
Maybe you are concerned that python's "public" attributes don't let you change the implementation behind the scenes later. Then meet the builtin property (its setter/deleter decorators arrived in python2.6).
It makes a method call look like plain attribute access. If your previously simple public attribute becomes a calculated value, turn it into a property - callers don't change.
From the python library docs:
class C(object):
def __init__(self):
self._x = None
@property
def x(self):
"""I'm the 'x' property."""
return self._x
@x.setter
def x(self, value):
self._x = value
@x.deleter
def x(self):
del self._x
c = C()
c.x # calls C.x()
c.x = 1 # calls x.setter
del c.x # calls x.deleter
I've yet to use them myself, so won't go into much detail.
Decorators are being used more these days, perhaps still considered rare though, at least to write.
Metaclasses are extremely rare (to most python devs)!
Before decorators arrived, metaclasses did all of python's metaprogramming. Now they are reserved for rarer jobs, like adding or removing methods & attributes at class creation time, whilst transforming individual functions and methods is left to decorators.
We might need a metaclass if:
__new__
Rarely seen in code, probably only with metaclasses and
for allowing subclasses of immutable types.
__new__ is a static method even though you don't specify @staticmethod
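A sketch of __new__ enabling an immutable subclass: a tuple's contents are fixed by the time __init__ runs, so the customisation has to happen in __new__ (the Point class is illustrative):

```python
class Point(tuple):
    """An immutable 2D point built on tuple."""
    def __new__(cls, x, y):
        # The instance is created (and frozen) here, before __init__ runs.
        return tuple.__new__(cls, (x, y))

    @property
    def x(self):
        return self[0]

    @property
    def y(self):
        return self[1]

p = Point(3, 4)
print(p.x)                   # 3
print(p.y)                   # 4
print(isinstance(p, tuple))  # True
```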
Creating a metaclass
A metaclass is the class of a class object, the blueprint for creating a class as a class is the blueprint for a class instance. The default metaclass is the 'type' object.
class MetaFoo(type):
"""Pointless metaclasses that prints when it's creating
and initialising a class object.
"""
def __new__(meta, name, bases, classDict):
print "Allocating {0}'s memory".format(name)
print "classDict: ", classDict
return type.__new__(meta, name, bases, classDict)
def __init__(meta, name, bases, classDict):
print "Init'ing (configuring) {0}".format(name)
super(MetaFoo, meta).__init__(name, bases, classDict)
class Foo(object):
__metaclass__=MetaFoo
# interpreter will have printed "Allocating...Foo" and "Init'ing...Foo"
# the classDict for this empty class has the keys __module__ and __metaclass__
f = Foo()
type(f) # class Foo
type(Foo) # class MetaFoo
type(f.__class__) # class MetaFoo
I'll leave you to investigate further!
Three part metaclass programming article:
http://www.ibm.com/developerworks/linux/library/l-pymeta.html
Meta-classes made Easy, how to automatically apply a decorator to every method in a class:
http://www.voidspace.org.uk/python/articles/metaclasses.shtml
An easter egg to try at your python shell:
>>> import this
The whitespace indentation is an example of ceremony reduction, as is the absence of a 'new' keyword (a la java/C++) when creating object instances.
Some basic productivity helpers are:
Prefer non-destructive updates
Non-destructive updates make it easier to chain & compose calls together, as you get new data returned rather than None.
No worrying about aliasing and mutating other people's data.
Sadly many python methods are destructive, like their C++/Java counterparts.
E.g. list.reverse, though there's the builtin function reversed to return a new copy. Same for list.sort vs sorted.
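For example, sorted and reversed return new data and leave the original alone, so the calls compose:

```python
nums = [3, 1, 2]

# Destructive: nums.sort() would mutate in place and return None.
# Non-destructive: the original survives and calls chain naturally.
print(sorted(nums))                  # [1, 2, 3]
print(list(reversed(nums)))          # [2, 1, 3]
print(nums)                          # [3, 1, 2] - untouched
print(list(reversed(sorted(nums))))  # [3, 2, 1] - composed
```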
This article is a great read, covering the different memory allocation styles developers prefer in different languages:
http://steve.yegge.googlepages.com/allocation-styles
Allocation style
# looping 10 times
for i in range(10):
...
range creates a list [0..9].
Python has no C-style for loop that initialises a value, updates it and finishes when the value meets some condition.
The following is very unpythonic:
i = 0
while i < 10:
...
i += 1
It's too much work for the developer even if it's more efficient in some ways. It's more prone to bugs and too verbose. If you do need an index value when looping over a sequence try:
for index, thing in enumerate(things):
...
foos = [...] #some list of numbers
nsTimes2 = [n * 2 for n in foos]
Again, even if foos is not needed by something else, the list comprehension is pleasanter than:
for index, n in enumerate(foos):
foos[index] = n * 2
It also avoids the index bookkeeping. (Note that in python 2 a list comprehension does leak its loop variable n into the enclosing scope; python 3's comprehensions keep it local.)
Consider the following. Walking the alias string twice can be considered inefficient, but simpler than looking for one char and one number in a loop and breaking when both have been found.
def randomAlias():
"""Generate a random name 3-11 characters with a mix of
alphanumeric characters.
"""
aliasLength = random.randint(3, 11)
validCharacters = string.letters + string.digits
while True:
alias = "".join( [random.choice(validCharacters)
for i in range(aliasLength)] )
if ( any(c in string.letters for c in alias) and
any(c in string.digits for c in alias) ):
return alias
Other points
data = [(foo, bar, baz,), (x, y, z), ...]
for item in data:
...
wibble = frobfoo(item[0])
Indexing is often considered ugly. Remember to use destructuring binds instead:
for foo, bar, baz in data:
...
data2 = [ [x, y, (foo, bar)], [p, q, (baz, wibble)], ...]
for a, b, (foo, bar) in data2:
...
for a, b, foobarTuple in data2:
...
Catch exceptions instead of checking
Sometimes it's easier to run a piece of code and catch some specific exceptions instead of making many checks. That is "it's easier to ask forgiveness than permission" so do not "look before you leap".
x = None
if foo(bar) and baz(bar) and ...:
x = frobBar(bar)
vs
try:
x = frobBar(bar)
except UseBarError:
x = None
'in' is a cool operator
x = foo()
if x == bar or x == baz or ...:
...
Often best to use 'in' for existence in a set:
x = foo()
if x in (bar, baz, ...):
...
Avoid Getters & Setters
class Foo(object):
def __init__(self, x):
self.x = x
def setX(self): ...
def getX(self): ...
Is considered verbose (and less efficient). You can always make it a property later if the value needs computing.
Long 'switches'
Python has no switch statement
n = bar()
if n == 0:
frobBar(n, ...)
elif n == 1:
wibbleBar(n, ...)
...
else:
defaultBar(n, ...)
Dictionaries are more performant than long switches. More declarative in style. dict.get can return a default if the key is not found.
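A sketch of dictionary dispatch replacing the chain above (the handler names are illustrative):

```python
def frobBar(n):
    return 'frob %d' % n

def wibbleBar(n):
    return 'wibble %d' % n

def defaultBar(n):
    return 'default %d' % n

handlers = {0: frobBar, 1: wibbleBar}

def dispatch(n):
    # dict.get supplies the 'else' branch of the old if/elif chain.
    return handlers.get(n, defaultBar)(n)

print(dispatch(0))   # 'frob 0'
print(dispatch(1))   # 'wibble 1'
print(dispatch(9))   # 'default 9'
```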
Zip it
zip is still a neat function, not just for functional programming.
names = ('Alice', 'Bob', ...)
ages = (18, 24, ...)
namesToAges = dict( zip(names, ages) )
zip bundles two different sequences into pairs e.g.
( ('Alice', 18), ('Bob', 24), ...)
XML is not the answer
XML is neither agile nor flexible compared to python (in java it may well be the pragmatic choice).
Python has data literals for dictionaries and lists, use them.
You don't need to stop, recompile, restart your program - you can reload() modules at runtime.
You can even evaluate strings as code or whole files - see exec, execfile and eval.
Design Patterns or Not!
16 of the 23 patterns in the GoF "design patterns" book are trivial or invisible in a dynamic language like python. With first class functions, design does not have to be class centric.
With first class types (classes are objects), the factory patterns become trivial compared to java or C++.
http://norvig.com/design-patterns for the full read.
Singleton
Ye Olde Global Variable, Single Class Instance
If you don't need subclasses of your singleton, the simplest solution, used most of the time, is:
module + private module scoped variables.
It's thread safe.
It's not lazy, but imports can be put in local blocks in python.
If subclassing is required, use the borg pattern, where multiple instances may be created but they all share the same state (slots/fields). We are concerned with state, not identity, after all.
class Foo(object):
__shared_state = { }
def __init__(self):
self.__dict__ = self.__shared_state
Once understood this is so elegant it's probably better than messing with module globals!
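A quick sketch of the borg in action, showing two distinct instances sharing state:

```python
class Borg(object):
    __shared_state = {}
    def __init__(self):
        # Every instance's attribute dict IS the one shared dict.
        self.__dict__ = self.__shared_state

a = Borg()
b = Borg()
a.x = 1
print(b.x)       # 1 - state is shared
print(a is b)    # False - identity is not
```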
Observer
Notification after an event.
Write a decorator that notifies listeners after some event.
The decorator implementation is kept separate from the subject class.
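One possible sketch (the module-level listener registry and the names here are illustrative, not a standard API):

```python
listeners = []

def notifies(func):
    """Decorator: call every registered listener after the event fires."""
    def wrapper(*args, **kw):
        result = func(*args, **kw)
        for listener in listeners:
            listener(func.__name__, result)
        return result
    return wrapper

class Account(object):
    def __init__(self):
        self.balance = 0
    @notifies
    def deposit(self, amount):
        self.balance += amount
        return self.balance

events = []
listeners.append(lambda name, result: events.append((name, result)))
acct = Account()
acct.deposit(10)
print(events)    # [('deposit', 10)]
```

Note the subject class Account knows nothing about the listeners; the decorator keeps the notification concern separate.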
Strategy
Algorithms should be substitutable. The following is pretty verbose:
class FooStrategy(object):
def __call__(self, bar, baz):
pass
class FooConcreteStrategy1(FooStrategy):
def __call__(self, bar, baz):
... # Strategy1 algorithm
useStrategy(FooConcreteStrategy1())
In python we have dynamic types and first class functions so we just pass in a function, we don't need classes.
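The same idea with plain functions - any callable of the right shape is a valid strategy (the functions here are illustrative):

```python
def shout(text):
    return text.upper() + '!'

def whisper(text):
    return text.lower() + '...'

def useStrategy(strategy, text):
    # Duck typing: anything callable as strategy(text) will do.
    return strategy(text)

print(useStrategy(shout, 'Hello'))     # 'HELLO!'
print(useStrategy(whisper, 'Hello'))   # 'hello...'
```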
Command, template & visitor patterns are also simpler because of first class functions.
In summary, Don't go class Mad
Java is class centric. The "kingdom of nouns".
http://steve-yegge.blogspot.com/2006/03/execution-in-kingdom-of-nouns.html
Python is balanced between classes and functions.
For those with the time and interest one adventure can be found at:
http://www.pythonchallenge.com/
(THE END)
repeat(obj, n=None) - takes an object and repeats it n
times or infinitely.
split(string, delimiter) - same as str.split but lazy.
zipwith(binaryFunc, seqA, seqB) e.g. zipwith(
operator.add, [1,2], [3,4]) # => lazy sequence [4, 6]
zipwith is like map using two lists.
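In standard Python the same effect falls out of map over two sequences (a sketch; zipwith itself is not a builtin):

```python
import operator

# map with two sequences pairs elements up, just like zipwith.
sums = list(map(operator.add, [1, 2], [3, 4]))
print(sums)   # [4, 6]

# The equivalent with zip and a comprehension:
sums2 = [a + b for a, b in zip([1, 2], [3, 4])]
print(sums2)  # [4, 6]
```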
Use the pysh option. Has command prompt features e.g. 'cd', 'pwd', 'ls' and tab completion.
>>> lsmagic # pysh list of 'magic' commands
>>> range? # pysh help on some object
>>> help(range) # standard py help, less detail.
>>> help('for') # use py help for a keyword
>>> pycat foo.py # syntax highlighting cat
Create some object references - play with list, tuple and dict literals.
>>> whos # pysh show object references + detail
>>> x = 1
>>> store x # pysh x is saved in the pysh session
>>> reset # pysh clear all persistent refs
>>> psearch x* # glob regexp object refs in scope
>>> import re # import some module
>>> edit re # open in EDITOR env var
# write history lines to file foo.py
>>> save c:/foo 1-3 5
>>> Out # pysh dict of stdout responses
>>> In # pysh list of input history
>>> hist? # learn to use command history
# multiple repeatable shell commands with one command
>>> macro foobar 1-3 5
>>> time sum(range(10000000)) # time?
>>> timeit sum(range(10000000)) #timeit?
>>> !dir # external shell command returned
Takes a path string to a file
Takes a string to search for
Returns all lines that have that substring in it
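A minimal sketch of that exercise (assuming plain substring matching, not regexes):

```python
def grep(path, needle):
    """Return every line of the file at path containing the substring needle."""
    with open(path) as f:
        return [line for line in f if needle in line]
```

Because it returns a new list rather than printing, the result composes with sorted, len and friends at the repl.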
Develop code at the repl, load, change, reload, interact with
the live system
Technically comma ',' is the tuple operator
Hello World 1
File hello.py, run at the commandline
python hello.py
Contents
print "hello, world!"
print does what it says on the tin and adds a new line. To skip the new line you could say:
print "hello, world!",
print is a statement, not a function. This sucks and is fixed in python 3.
Other things to know:
Modules effectively force namespaces onto the programmer (the module is like a completely static class in java).
First consider which version and implementation:
Go for 2.x until more libraries have moved over to 3.x.
CPython, or Jython (JVM), IronPython (.NET) etc. CPython = stable. JVM etc = better compiler tech,
but fewer features.
I suggest getting the latest (2.6.4) CPython from www.python.org.
The IPython shell + PyReadline - http://ipython.scipy.org/moin/
There is a tutorial that comes with python and an excellent free book for experienced programmers:
http://diveintopython.org
http://diveintopython3.org
Define your own sliceable objects:
object.__setslice__
object.__getslice__
(both deprecated in favour of __setitem__/__getitem__ receiving slice objects)
def salute(func):
"""Silly decorator"""
def saluter(*args):
print "hello"
ret = func(*args)
print "you"
return ret
return saluter
class salute(object):
"""Silly decorator"""
def __init__(self, func):
self.func = func
def __call__(self, *args):
print "hello"
ret = self.func(*args)
print "you"
return ret
Example usage:
@salute
def foo(a,b):
"""adds a & b"""
return a + b
@salute("hey", "ho")
def foo2(a,b):
"""adds a & b"""
return a + b
foo(1,2) # => 3, prints "hello\nyou"
foo2(1,2) # => 3, prints "hey\nho"
def salute(beforeCall, afterCall):
"""Silly decorator"""
def wrap(func):
def saluter(*args):
print beforeCall
ret = func(*args)
print afterCall
return ret
return saluter
return wrap
class salute(object):
"""Silly decorator"""
def __init__(self, beforeCall, afterCall):
self.beforeCall = beforeCall
self.afterCall = afterCall
def __call__(self, func):
def wrapped(*args):
print self.beforeCall
ret = func(*args)
print self.afterCall
return ret
return wrapped