Adapting your sources for Python 3 without losing Python 2 compatibility

As a follow-up to the previous article, I'd like to list here some hints about adapting Python 2 sources for the migration to Python 3, which I discovered while doing the same job on some of my scripts. The list is not complete because, as I said, it comes from personal experience and not from a thorough exploration.

To write code compatible with Python version 2 and 3 you have to enable some backports that are available starting from the 2.5 version of the language.

General changes

“long”s will be gone

In Python 3 the distinction between int and long disappears, and so does the "0L" syntax. To avoid syntax errors you can replace "0L" with "long(0)", and under Python 3 define long = int so that everything works again. At the beginning of the script, check the Python version to enable the substitution:

import sys
python_version_major = sys.version_info [ 0 ]

if python_version_major > 2:
 long = int

xrange function replaced by range

As in the previous case, it's enough to check the Python version: when running under Python 3 you define xrange = range, so that the rest of the code can keep calling "xrange".
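A minimal sketch of this alias, assuming the same version check used for "long"; the "total" variable is only there for illustration:

```python
import sys

# On Python 3 the name "xrange" no longer exists, so alias it to "range".
if sys.version_info[0] > 2:
    xrange = range

# "xrange" now works on both major versions and is lazy on both.
total = sum(i * i for i in xrange(5))
print(total)
```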

The “unicode” type will disappear

In Python 3 all strings are unicode: the "str" datatype holds unicode text and the separate "unicode" type no longer exists. As in the previous cases (again!), testing the Python version is enough to define unicode = str when needed.
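The same idea as a short sketch. The "to_text" helper is a hypothetical name of mine, not something from the standard library, and the "utf-8" default is an assumption:

```python
import sys

if sys.version_info[0] > 2:
    unicode = str  # Python 3: "str" is already the unicode type

def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return value as a unicode string."""
    if isinstance(value, bytes):
        # Byte strings are decoded ("bytes" is "str" on Python 2.6+).
        return value.decode(encoding)
    return unicode(value)

print(to_text(b"abc"), to_text(42))
```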

Some functions will return an iterator in place of a list

These functions include zip, range and map.
Starting from Python 3 they return an iterator (strictly speaking, range returns a lazy sequence object), while in Python 2 they return a list. For compatibility you can wrap the call in "list()", which forces the result to be a list in Python 3 as well.
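For example, with some made-up data, the wrapped calls give the same lists on both versions:

```python
names = ["a", "b", "c"]
values = [1, 2, 3]

# Each call is wrapped in list() so the result can be indexed and reused.
pairs = list(zip(names, values))
doubled = list(map(lambda v: v * 2, values))
indexes = list(range(len(names)))

print(pairs[0], doubled[-1], indexes)
```

The price is that on Python 3 you lose the laziness, so it's worth wrapping only where a real list is needed.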

next method replaced by __next__

When you define an iterator you must use the "__next__" method in place of "next". So, for compatibility you can define them to be equal:

def next ( self ):
  ...
__next__ = next

This also means that, to retrieve the next element of an iterator, you can no longer call its "next" method: you must use the built-in "next" function introduced with Python 2.6:

next(foo) instead of foo.next()
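Putting the two pieces together, here is a small sketch of an iterator that works on both versions. The "Countdown" class is a made-up example:

```python
class Countdown(object):
    """Made-up iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def next(self):  # name looked up by Python 2
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    __next__ = next  # name looked up by Python 3


counter = Countdown(3)
first = next(counter)  # built-in next(), available from Python 2.6
rest = list(counter)   # consumes what is left
print(first, rest)
```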

__nonzero__ method replaced by __bool__

To implement the truth test on an object in Python 2 you use the __nonzero__ method, while in Python 3 this is replaced by the __bool__ method. So, for compatibility, it's enough to define them to be equal:

def __nonzero__ ( self ):
  …
__bool__ = __nonzero__
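A complete version of the pattern, using a made-up "Basket" class that is true only when it contains something:

```python
class Basket(object):
    """Made-up container used only to show the truth test."""

    def __init__(self, items=()):
        self.items = list(items)

    def __nonzero__(self):  # Python 2 hook
        return len(self.items) > 0

    __bool__ = __nonzero__  # Python 3 hook


empty_is_true = bool(Basket())
full_is_true = bool(Basket(["apple"]))
print(empty_is_true, full_is_true)
```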

Exceptions cannot be strings

In other words this syntax is now invalid:

raise SerializationError, "decoding error"

but it should be:

raise SerializationError ( "decoding error" )
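A runnable sketch of the call form. The "SerializationError" class here is a stand-in I define myself, since the real one is not shown, and "decode" is a hypothetical function:

```python
class SerializationError(Exception):
    """Stand-in definition, assumed for this example only."""


def decode(data):
    # Hypothetical function: rejects empty input.
    if not data:
        raise SerializationError("decoding error")
    return data


try:
    decode(b"")
except SerializationError as exc:
    message = str(exc)

print(message)
```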

Changes to modules

Removed “exceptions” module

Exceptions don't belong to the "exceptions" module any more: they are all built-in names (they live in __builtins__), so the "exceptions" module has become useless and has been removed.

“md5”, “sha”, etc. modules collected under “hashlib”

Modules related to hash generation have been collected under hashlib, so that imports must be changed. For example the following instruction:

import md5

should be:

from hashlib import md5

Warning: the "sha" module is now split into several variants ("sha1", "sha224", etc.), so when writing the import you should alias the name to preserve compatibility:

from hashlib import sha1 as sha
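One detail to keep in mind: with these imports "md5" and "sha" are constructor functions and not modules, so an old call such as md5.new(data) becomes md5(data). A short sketch with an arbitrary input:

```python
from hashlib import md5
from hashlib import sha1 as sha

# The imported names are called directly; there is no ".new" any more.
digest_md5 = md5(b"hello").hexdigest()
digest_sha = sha(b"hello").hexdigest()

print(digest_md5)
print(digest_sha)
```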

Changes introduced from Python 2.5

Absolute imports

In Python 3 imports are absolute: to import a file from the same package you must use the dot "." before the name of the file to make it relative to the current package.
To enable this behaviour from Python 2.5:

from __future__ import absolute_import

and after it imports must be written as follows:

from .foo import Foo
from . import foo

etc., while the usual syntax:

import foo

will always look for a top-level "foo" module on sys.path, never for a sibling inside the current package.

This, of course, is a problem: relative imports are resolved using the __name__ attribute of a module. If you execute a module as __main__, for example when you're testing it, the interpreter doesn't know where to start looking for modules imported with a relative path and raises an exception. This means that you cannot run such a module directly as a standalone program: you are forced to install and import it, after which relative paths will work. Alternatively, you can use only absolute paths, but this still forces you to install the module before you can run it, not to mention that if you move it into another directory of your package you have to fix all the paths in the imports.

In other words: the first module must be imported with an absolute path, then all the relative imports will work.

This is annoying, to say the least, because you cannot use the __name__ == '__main__' trick any more to test a single module.

A rather complex solution is to add the following code to the beginning of the module you need to make a standalone executable:

if __name__ == '__main__':
  import os, sys
  # fullpath of this file
  __this_path = os.path.abspath ( sys.argv [ 0 ] )
  # module name (this file name without extension)
  __this_name = os.path.splitext ( os.path.basename ( __this_path ) ) [ 0 ]
  # fullpath of the directory that contains this file
  __parent_path = os.path.dirname ( __this_path )
  # name of the directory that contains this file
  __parent_name = os.path.basename ( __parent_path )
  # add to the path the directory up two levels (that is not the one
  # that contains the script, but one more above it)
  sys.path.insert ( 0, os.path.dirname ( __parent_path ) )
  # import as a module the directory that contains the script
  __import__ ( __parent_name )
  # change the script name
  __name__ = '%s.%s' % ( __parent_name, __this_name )
  # from this point on relative imports work

These lines turn the directory containing the module into a package: they add its parent directory to sys.path, import the directory as a package and change the module's internal name (the one the interpreter sees) to '<package>.<module>', so that relative imports start to work. Because everything is computed from paths, the code isn't tied to the module's location, which can change.

Be warned: by using sys.path.insert you may insert names that will override identical system names (if your package happens to have overlapping directory names with system ones); by using sys.path.append the risk is to use a previously installed version of the package instead of the one you're testing. I don't think there's a way to fix both issues at the same time.

Changes introduced from Python 2.6

Print function

from __future__ import print_function
def print(*args, sep=' ', end='\n', file=None)
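A small sketch of the keyword arguments in action. The "Collector" class is a made-up file-like object (anything with a "write" method will do), used here only to capture the output:

```python
from __future__ import print_function


class Collector(object):
    """Made-up file-like object that stores what is written to it."""

    def __init__(self):
        self.parts = []

    def write(self, text):
        self.parts.append(text)


out = Collector()
# sep joins the arguments, end replaces the final newline,
# file redirects the output away from sys.stdout.
print("a", "b", "c", sep="-", end="!\n", file=out)
captured = "".join(out.parts)

print(repr(captured))
```

Remember that the __future__ import must come before any other statement of the module.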

New syntax to capture the value of an exception

In Python 3 the “as” keyword is mandatory:

try:
    …
except TypeError as exc:
    …
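A filled-in example of the same syntax, which works from Python 2.6 onwards. The "parse_int" function is a hypothetical helper; note that catching several exception types requires a parenthesized tuple:

```python
def parse_int(text):
    """Hypothetical helper: return an int or an error description."""
    try:
        return int(text)
    except (TypeError, ValueError) as exc:
        return "cannot parse %r: %s" % (text, exc.__class__.__name__)


print(parse_int("42"))
print(parse_int("x"))
print(parse_int(None))
```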

Unicode format for all literal strings

from __future__ import unicode_literals

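This affects only the string literals of the module that contains the import: strings obtained in other ways, for example read from the terminal in Python 2, are still byte strings.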

Difference between unicode and 8 bit

Everything that is 8 bit should be built with "bytes" or with the "b" prefix. In Python 2 "bytes" is just an alias for "str", so the result is not the same as in Python 3; even so, you should write "bytes" in Python 2 code, for example in "isinstance(x, bytes)", so that the 2to3 tool understands that you're handling bytes and not unicode strings.
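A short sketch of byte strings that behaves the same on Python 2.6+ and Python 3, including "bytearray", the mutable counterpart of "bytes". The variable names are mine:

```python
data = b"abc"
is_bytes = isinstance(data, bytes)  # True on both versions

# "bytes" objects cannot be modified in place; "bytearray" objects can.
buf = bytearray(data)
buf[0] = ord("x")
buf.extend(b"def")
result = bytes(buf)

print(is_bytes, result == b"xbcdef")
```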

"bytes" in Python 2 returns an immutable string. If you want a mutable byte string you should use "bytearray".

Notation for binary and octal numbers

The new syntax for octal numbers is the "0o" prefix instead of a plain leading "0". There's also the new "0b" prefix for binary numbers, together with the "bin" function that converts a number to its binary string representation.
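For example, with two arbitrary values:

```python
mode = 0o755   # octal literal, valid from Python 2.6 and in Python 3
mask = 0b1010  # binary literal

print(mode, mask, bin(mask))
```

I left "oct" out on purpose, since as far as I know its output format differs between Python 2 and Python 3.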

Integer division

In Python 2 the "classic" division of two integers is an integer division. To switch to the "true" division you need:

from __future__ import division

True division for ints and longs will convert the arguments to float and then apply a float division. That is, even 2/1 will return a float (2.0), not an int. For floats and complex, it will be the same as classic division.

Integer division will become "//".
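A quick sketch of the two operators once the import is in place (the values are arbitrary):

```python
from __future__ import division

true_result = 7 / 2       # true division: always a float
floor_result = 7 // 2     # floor division: stays an int
whole = 2 / 1             # a float even when the result is whole
negative_floor = -7 // 2  # floors towards negative infinity

print(true_result, floor_result, whole, negative_floor)
```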

"callable" removed

"callable" has been deprecated; it has been replaced by:

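isinstance(x, collections.Callable)

Note that the built-in "callable" came back in Python 3.2, so this check is really needed only for Python 3.0 and 3.1. Also, "Callable" has lived in "collections.abc" since Python 3.3, and the "collections.Callable" alias was removed in Python 3.10.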
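A sketch of a check that should work across versions, assuming you want to avoid the built-in altogether. The "is_callable" name is mine:

```python
try:
    from collections.abc import Callable  # Python 3.3 and later
except ImportError:
    from collections import Callable      # Python 2.6 up to 3.2


def is_callable(obj):
    """Hypothetical replacement for the callable() built-in."""
    return isinstance(obj, Callable)


print(is_callable(len), is_callable(42))
```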

"has_key" removed

The "has_key" method has been removed: you must use the "in" operator.
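For example, with a made-up dictionary:

```python
settings = {"debug": True}

# Instead of settings.has_key("debug"):
has_debug = "debug" in settings
has_verbose = "verbose" in settings

# "get" is still there when a default value is all you need.
level = settings.get("level", 0)

print(has_debug, has_verbose, level)
```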

"apply" removed

"apply" has been deprecated: you must use the extended function call:

func(*args, **kwargs)
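A complete example of the replacement, where "describe" is a made-up function; the old form would have been apply(describe, args, kwargs):

```python
def describe(name, greeting="Hello", punctuation="!"):
    return "%s, %s%s" % (greeting, name, punctuation)


args = ("world",)
kwargs = {"greeting": "Goodbye", "punctuation": "."}

# The tuple is unpacked into positional arguments,
# the dictionary into keyword arguments.
message = describe(*args, **kwargs)
print(message)
```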

Changes introduced from Python 2.7

DeprecationWarning

These warnings have been disabled by default. While developing you should enable them. There are several ways to do it:

by using the "-Wdefault" flag from the command line;
by setting the PYTHONWARNINGS environment variable to "default";
by calling warnings.simplefilter('default') in your code.
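The third option can be sketched like this. The "old_api" function is made up, and the "catch_warnings" block is there only to keep the filter change local and to collect the warnings so they can be inspected:

```python
import warnings


def old_api():
    """Made-up function that announces its own deprecation."""
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("default")  # re-enable DeprecationWarning
    value = old_api()

categories = [w.category for w in caught]
print(value, categories)
```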

Dictionary: keys, values and items

Starting from Python 3 these methods return a "view" that is automatically updated when the dictionary is updated. In Python 2.7 you get this behaviour with the viewkeys, viewvalues and viewitems methods, which are automatically converted by the 2to3 tool.

Alternatively you can keep on using keys, values and items the old way, but enclosing each call in "list()" so that you get a list in Python 3 as well.
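A sketch of the "list()" approach with a made-up dictionary; the snapshot taken before the update does not change, on either version:

```python
inventory = {"apples": 3}

# A real list: it is a snapshot and does not follow later updates.
keys_snapshot = list(inventory.keys())

inventory["pears"] = 5
sorted_keys = sorted(inventory.keys())
pairs = sorted(list(inventory.items()))

print(keys_snapshot, sorted_keys, pairs)
```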

Literal set syntax

Braces: {1, 2, 3, 4}
Empty set: set()

Dictionary and set comprehensions

>>> {x: x*x for x in range(6)}
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
>>> {('a'*x) for x in range(6)}
set(['', 'a', 'aa', 'aaa', 'aaaa', 'aaaaa'])

__cmp__ method removed

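The __cmp__ method is gone in Python 3: you must define the individual comparison methods of an object (__eq__, __lt__, etc.). There's a utility that helps with the task, for instance the functools.total_ordering class decorator, available from Python 2.7, which fills in the missing ordering methods once you define __eq__ and one of __lt__, __le__, __gt__ or __ge__.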
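One way to do it is sketched here with functools.total_ordering, assuming that is an acceptable helper for your code base. The "Version" class is a made-up example; __ne__ is written out because Python 2 does not derive it from __eq__:

```python
from functools import total_ordering


@total_ordering
class Version(object):
    """Made-up class compared by (major, minor)."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        return self._key() == other._key()

    def __ne__(self, other):
        return not self == other

    def __lt__(self, other):
        return self._key() < other._key()

    def __hash__(self):
        return hash(self._key())


# <=, > and >= are supplied by the decorator.
newest = max([Version(2, 7), Version(3, 1), Version(3, 0)])
print(newest.major, newest.minor)
```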

importlib

Python 3.1 has a richer importlib that contains a complete reimplementation of the import machinery. The Python 2.7 version contains only the import_module function.
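A short sketch of import_module, which takes the module name as a string; the two standard modules are chosen only as examples:

```python
import importlib

# The module name is plain data, so it can be computed at run time.
json_module = importlib.import_module("json")
path_module = importlib.import_module("os.path")

encoded = json_module.dumps([1, 2])
print(encoded, path_module.basename("/tmp/example.txt"))
```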
