What is the difference between Python 2 and Python 3?
Below are the key differences I have run into most often while working with Python 2 and Python 3.
Integer division
Python2 : The return type of a division (/) operation depends on its operands. If both operands are of type int, floor division is performed and an int is returned. If either operand is a float, true division is performed and a float is returned. The // operator is also provided for floor division regardless of the operand types.
>>> 3/2
1
>>> -3/2
-2
>>> 3//2
1
>>> -3//2
-2
Python3 : Division (/) always returns a float. To do floor division and get an integer result (discarding any fractional part), you need to use the // operator.
>>> 3/2
1.5
>>> -3/2
-1.5
>>> 3//2
1
>>> -3//2
-2
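Here is a small sketch of my own (the variable names are just for illustration) that puts the Python 3 rules side by side. As far as I know, Python 2.2 and later give the same results if you put from __future__ import division at the top of the module.

```python
# Python 3 semantics for the two division operators.
true_quotient = 3 / 2        # true division always yields a float
floor_quotient = 3 // 2      # floor division of two ints yields an int
negative_floor = -3 // 2     # floors toward negative infinity, not toward zero
float_floor = 3.0 // 2       # floor division with a float operand yields a float
truncated = int(-3 / 2)      # int() truncates toward zero instead of flooring

print(true_quotient, floor_quotient, negative_floor, float_floor, truncated)
```

Note that // floors while int() truncates, so the two differ for negative results.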
Input function
Python2 : The input() function evaluates whatever you type as a Python expression (it behaves like eval(raw_input())). So input() can return an int, str, float, bool, list and so on, depending on your input. For instance,
>>> val = input("Enter any value: ")
Enter any value: 7
>>> type(val)
int
>>> val = input("Enter any value: ")
Enter any value: 7.0
>>> type(val)
float
>>> val = input("Enter any value: ")
Enter any value: 'abc'
>>> type(val)
str
>>> val = input("Enter any value: ")
Enter any value: True
>>> type(val)
bool
>>> val = input("Enter any value: ")
Enter any value: [1,2,3,4,5]
>>> type(val)
list
When you use raw_input(), Python always returns a string, whatever the data type of the value you typed.
>>> val = raw_input("Enter any value: ")
Enter any value: 7
>>> type(val)
str
>>> val = raw_input("Enter any value: ")
Enter any value: 7.0
>>> type(val)
str
>>> val = raw_input("Enter any value: ")
Enter any value: 'abc'
>>> type(val)
str
and so on..
Python3 : In Python 3, the input() function always returns a string (it acts like Python 2's raw_input()), and raw_input() no longer exists. This can be confusing when you are moving from Python 2 to Python 3.
>>> val = input("Enter any value: ")
Enter any value: 7
>>> type(val)
str
Note: if you are using Python 2.7, it's a good idea to use raw_input() when you are reading user input. In Python 3, you would use input(). Each of these returns the input as a string in its own version of Python.
I feel it is safer to convert types manually than to let Python do it dynamically. Still, because input() in Python 3 returns every value as a string, forgetting the conversion can give unexpected results. Consider the example below,
>>> a = input("Enter first number: ")
Enter first number: 3.4
>>> b = input("Enter second number: ")
Enter second number: 5.7
>>> print(a + b)
3.45.7
# It's because
>>> type(a), type(b)
(str, str)
# To fix this, apply float() to the value returned by input() before doing arithmetic.
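A minimal sketch of that manual conversion is below. The helper name read_float is my own invention, not a built-in, and I pass plain strings instead of calling input() so the snippet runs without a prompt.

```python
def read_float(text):
    """Convert raw user input to a float; return None if it is not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None


# Stand-ins for what input() would return in Python 3 (always str).
a = read_float("3.4")
b = read_float("5.7")
total = a + b              # numeric addition now, not string concatenation
invalid = read_float("abc")  # non-numeric text is reported as None

print(total, invalid)
```

In a real program you would call it as read_float(input("Enter first number: ")) and decide what to do when it returns None.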
Round function
Python2 : Returns the floating point value of the number rounded to ndigits digits after the decimal point. If ndigits is omitted, it defaults to zero. The result is always a floating point number. Values are rounded to the closest multiple of 10 to the power minus ndigits; if two multiples are equally close, rounding is done away from 0 (so, for example, round(0.5) is 1.0 and round(-0.5) is -1.0).
>>> round(3.5)
4.0
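Python3 : Returns the number rounded to ndigits precision after the decimal point. If ndigits is omitted or is None, it returns the nearest integer to its input, as an int rather than a float. If two multiples are equally close, rounding is done toward the even choice (so round(0.5) and round(-0.5) are both 0, and round(2.5) is 2).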
>>> round(3.5)
4
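To make the tie-breaking difference visible, here is a sketch run under Python 3. The helper round_half_away is a hypothetical name of mine; it uses the decimal module to imitate the Python 2 tie rule (it returns an int, whereas Python 2's round() returned a float).

```python
from decimal import Decimal, ROUND_HALF_UP

# Python 3: ties go to the nearest even integer and the result is an int.
ties_to_even = [round(0.5), round(1.5), round(2.5), round(3.5)]


def round_half_away(value):
    """Round a decimal string to an integer, with ties going away from zero."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


half_away = [round_half_away(v) for v in ("0.5", "1.5", "2.5", "-0.5")]

print(ties_to_even)  # [0, 2, 2, 4]
print(half_away)     # [1, 2, 3, -1]
```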
Print function
Python2 : print is a statement, so the extra pair of parentheses is not mandatory.
>>> print "Hello"
Hello
>>> print 'Hello'
Hello
>>> print ("Hello")
Hello
>>> print ('Hello')
Hello
Python3 : print is a function, so the parentheses are now mandatory.
>>> print "Hello"
SyntaxError: Missing parentheses in call to 'print'.
>>> print 'Hello'
SyntaxError: Missing parentheses in call to 'print'.
>>> print ("Hello")
Hello
>>> print ('Hello')
Hello
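Because print is an ordinary function in Python 3, it also takes keyword arguments. The sketch below is my own example and writes into an in-memory buffer only so that the result can be inspected. On Python 2.6+ the same function form is available with from __future__ import print_function.

```python
import io

# print() accepts sep, end and file keyword arguments in Python 3.
buffer = io.StringIO()
print("Hello", "World", sep=", ", end="!\n", file=buffer)
output = buffer.getvalue()

print(repr(output))  # 'Hello, World!\n'
```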
ASCII, Unicode and Byte types
(Please skip the question below if you already have this understanding.)
What are ASCII, Unicode and byte types in simple English?
ASCII defines 128 characters, which map to the numbers 0–127. ASCII uses 7 bits to represent a character. With 7 bits we can have at most 2^7 (= 128) distinct combinations, which means we can represent at most 128 characters. When a character is stored in an 8-bit byte, the eighth bit was historically sometimes used as a parity bit for error detection.
Most ASCII characters are printable characters such as a, b, c, A, B, C, 1, 2, 3, ?, &, ! etc. The others are control characters such as carriage return, line feed, tab, etc. ASCII was meant for English only.
Binary representation of a few characters will be like: 1000001 -> A (ASCII code - 65) or 0001101 -> Carriage Return (ASCII code - 13)
ASCII (including the extended 8-bit variants with 2^8 = 256 characters) solves the problem for languages that are based on the Latin alphabet. But what about the others that need a completely different alphabet? Russian? Hindi? Chinese? Each would have needed an entirely new character set; that's the rationale behind Unicode.
Unicode defines (less than) 2^21 characters, which, similarly, map to numbers 0–2^21 (though not all numbers are currently assigned, and some are reserved). Unicode is a superset of ASCII, and the numbers 0–127 have the same meaning in ASCII as they have in Unicode. For example, the number 65 means "Latin capital 'A' ". Because Unicode characters don't generally fit into one 8-bit byte, there are numerous ways of storing Unicode characters in byte sequences, such as UTF-32 and UTF-8.
Python2 : It has a str type that holds raw bytes (treated as ASCII by default) and a separate unicode type, but there is no distinct byte type; bytes is just an alias for str.
>>> type(unicode('a'))
unicode
>>> type(u'a')
unicode
>>> type(b'a')
str
Python3 : str is a Unicode string type, and there is a separate bytes type.
>>> type(unicode('a'))
NameError: name 'unicode' is not defined
>>> type(u'a')
str
>>> type(b'a')
bytes
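In Python 3 you move between the two types with encode() and decode(). A small sketch of my own, assuming UTF-8 as the encoding:

```python
text = "h\u00e9llo"                # str: 5 Unicode characters (the second is é)
data = text.encode("utf-8")        # bytes: é takes two bytes in UTF-8
round_trip = data.decode("utf-8")  # back to str

# str and bytes never mix implicitly in Python 3.
try:
    "abc" + b"def"
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(len(text), len(data), data, round_trip == text, mixed_ok)
```

This prints 5 6 b'h\xc3\xa9llo' True False, which shows that the character count and the byte count are different things.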
Range function
Python2 : It has range and a separate xrange function. Use range when you need an actual list, and use xrange when you only need to iterate one item at a time; xrange produces its values lazily, like a generator, which saves memory and is usually a little faster.
>>> %timeit [i for i in range(1000)]
The slowest run took 4.72 times longer than the fastest. This could mean that an intermediate result is being cached. 10000 loops, best of 3: 45.3 µs per loop
>>> %timeit [i for i in xrange(1000)]
10000 loops, best of 3: 40.1 µs per loop
Python3 : Here, range does what xrange used to do and xrange does not exist separately. If you want to write code that will run on both Python 2 and Python 3, you can't use xrange.
>>> %timeit [i for i in range(1000)]
42.6 µs ± 1.23 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit [i for i in xrange(1000)]
NameError: name 'xrange' is not defined
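The memory saving is easier to see than the timing difference. In this sketch of mine (Python 3), the range object answers length, indexing and membership questions without ever building a list:

```python
big = range(10**9)            # no list of a billion ints is created
size = len(big)
sixth = big[5]                # indexing is computed, not looked up in a list
last = big[-1]
has_value = 999999 in big     # membership test is computed for ints too

first_five = list(range(5))   # build a real list only when you need one

print(size, sixth, last, has_value, first_five)
```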
Exception handling
Python2 : Both the old comma syntax and the as keyword (available from Python 2.6) are accepted.
try:
    BlahBlah
except NameError, error:
    print error, "=> ERROR IS HERE!"
# output
name 'BlahBlah' is not defined => ERROR IS HERE!
try:
    BlahBlah
except NameError as error:
    print error, "=> ERROR IS HERE!"
# output
name 'BlahBlah' is not defined => ERROR IS HERE!
Python3 : You need to use the as keyword; the comma form is a SyntaxError.
try:
    BlahBlah
except NameError as error:
    print(error, "=> ERROR IS HERE!")
# output
name 'BlahBlah' is not defined => ERROR IS HERE!
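One related detail I am adding here: in Python 3 the name bound by as is unbound again when the except block ends, so copy whatever you need out of it. The function name describe_error below is just my illustration.

```python
def describe_error():
    """Return the message of a deliberately triggered NameError."""
    try:
        BlahBlah  # undefined name, raises NameError
    except NameError as error:
        # Keep a copy: Python 3 deletes 'error' at the end of this block.
        message = str(error)
    return message


print(describe_error(), "=> ERROR IS HERE!")
```

The as form shown here also runs unchanged on Python 2.6 and 2.7 (apart from the print call), so it is the one to use in code shared between versions.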
Global Namespace Leak
Python2 : The loop variable of a list comprehension leaks into the enclosing scope. In the example below, it overwrites the global variable num.
num = 7
print (num)
mylist = [num for num in range(100)]
print (num)
# output
7
99
Python3 : It's fixed now. A list comprehension has its own scope, so there is no namespace leak.
num = 7
print (num)
mylist = [num for num in range(100)]
print (num)
# output
7
7
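Keep in mind that this applies to comprehensions only. A plain for loop still rebinds the variable in Python 3, as this small sketch of mine shows:

```python
num = 7
squares = [num * num for num in range(5)]  # this 'num' is local to the comprehension
after_comprehension = num                  # still 7 in Python 3

for num in range(5):                       # a plain for loop does rebind 'num'
    pass
after_loop = num                           # now 4

print(after_comprehension, after_loop)
```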
Functions and methods that don’t return lists anymore in Python3
zip()
map()
filter()
dictionary’s .keys() method
dictionary’s .values() method
dictionary’s .items() method
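What they return instead are lazy iterators (zip, map, filter) and dictionary views (keys, values, items). The sketch below is my own example of the two things that tend to surprise people: an iterator can be consumed only once, and a view follows later changes to the dictionary.

```python
pairs = zip([1, 2, 3], "abc")                    # iterator, not a list
doubled = map(lambda x: x * 2, [1, 2, 3])        # iterator
evens = filter(lambda x: x % 2 == 0, range(6))   # iterator

pairs_list = list(pairs)     # wrap in list() when you need a real list
second_pass = list(pairs)    # the iterator is already exhausted
doubled_list = list(doubled)
evens_list = list(evens)

ages = {"ann": 30, "bob": 25}
keys = ages.keys()           # a view object, not a list
ages["cy"] = 41              # the view sees the new key
keys_list = sorted(keys)

print(pairs_list, second_pass, doubled_list, evens_list, keys_list)
```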
List Comprehensions
Python2 : The parentheses around the tuple are optional.
[item for item in 1,2,3,4,5]
[1, 2, 3, 4, 5]
Python3 : You need to wrap the tuple in an extra pair of parentheses.
[item for item in (1,2,3,4,5)]
[1, 2, 3, 4, 5]
I have mentioned just 10 differences. Please add a comment if you can recall others.