1.1 The Python Memory Model Introduction PDF

Title 1.1 The Python Memory Model Introduction
Author Ananya Jha
Course Introduction to Computer Science
Institution University of Toronto


07/01/2020

1.1 The Python Memory Model: Introduction

Before we dive into the CSC148 material proper, we’ll review a few fundamental concepts from CSC108. We start with one of the most important ones: how the Python programming language represents data.

Data

All data in a Python program is stored in objects that have three components: id, type, and value. We normally think about the value when we talk about data, but the data’s type and id are also important.

The id of an object is a unique identifier, meaning that no other object existing at the same time has the same identifier. Often Python uses the memory address of the object as its id, but it doesn’t have to; it just has to guarantee uniqueness. We can see the id of any object by calling the id function:

>>> id(3)
1635361280
>>> id('words')
4297547872

We can see the type of an object by calling the type function:

>>> type(3)
<class 'int'>
>>> type('words')
<class 'str'>

An object’s type determines what functions can operate on it. For example, we can call the function round on numeric types (such as int and float), but not on strings:

>>> round(2)
2
>>> round(3.1419)
3
>>> round('hello!')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type str doesn't define __round__ method

Types also determine the objects on which we can use built-in Python operators. [1]

[1] We’ll see later in the course that most of Python’s operators are actually implemented using functions.

For example, the + operator works on two integers, and even on two strings, but is not defined for adding an integer and a string together:
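The behaviour of + described above can be checked directly. The following is a minimal sketch; the helper name try_add is hypothetical and exists only for this illustration.

```python
def try_add(a, b):
    """Return a + b, or the name of the exception if + is not defined for them."""
    try:
        return a + b
    except TypeError as error:
        return type(error).__name__


print(try_add(3, 4))           # two ints: 7
print(try_add('bon', 'jour'))  # two strs: bonjour
print(try_add(3, 'words'))     # an int and a str: TypeError
```

The types of both operands decide whether the operator applies, which is why the last call fails even though each operand supports + on its own.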


Variables

All programming languages have the concept of variables. In Python, a variable is not an object, and so does not actually store data; it stores an id that refers to an object that stores data. This is the case whether the data is something very simple like an int or more complex like a str. Consider this code:

>>> x = 3
>>> x
3
>>> type(x)
<class 'int'>
>>> id(x)
1635361280
>>> word = 'bonjour'
>>> type(word)
<class 'str'>
>>> id(word)
4385008808

The state of memory after the above piece of code executes is this:

We write the id and type of each object in its upper-left corner and upper-right corner, respectively. The actual object id reported by the id function has many digits, and its true value isn’t important; we just need to know that each object has a unique identifier. So for our drawings we make up short identifiers such as id92.

Notice that there is no 3 inside the box for variable x. Instead, there is the id of an object whose value is 3. We say that x refers to this object, or that x references this object. The same holds for variable word; it references an object whose value is 'bonjour'.

Here are a couple of other things to notice:

- Since we did not write the code for the class that defines the str type, we know nothing about what data members it uses to store its contents. So we just write the value 'bonjour' inside the box. This is a perfectly fine abstraction.
- We didn’t draw any arrows. Programmers often draw an arrow when they want to show that one thing references another. This is great once you are very confident with a language and how references work. But in the early stages, you are much more likely to make correct predictions if you write down references [...]
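The memory model described above can also be written out as two lookup tables, one for variables and one for objects. This is only a sketch for illustration: the dictionaries variables and objects are hypothetical names and are not how Python itself stores this information.

```python
x = 3
word = 'bonjour'

# Variables store ids, not values (hypothetical table for illustration).
variables = {'x': id(x), 'word': id(word)}

# Each object is found by its id and carries a type and a value.
objects = {id(obj): (type(obj).__name__, obj) for obj in (x, word)}

for name, object_id in variables.items():
    type_name, value = objects[object_id]
    print(f'{name} -> id {object_id} -> {type_name} object with value {value!r}')
```

The printed ids will differ from run to run; as in the drawings, only the fact that each variable holds the id of some object matters.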


We saw above that Python will report to us what type(word) is. But it is really reporting the type of the object that word refers to. The variable word itself has no type. [2]

[2] This is different from many other languages, such as Java and C, where every variable has a type.

In fact, Python doesn’t mind if we make word refer now to a different type of object, although this is almost surely a bad idea.

>>> word = 'adieu'
>>> type(word)
<class 'str'>
>>> word = 42
>>> type(word)
<class 'int'>

A brief aside on assignment statements and evaluating expressions

You’ve written code much more complex than what’s above, but may not have had to think in detail about all the small steps that Python has to undertake to execute even a simple assignment statement. These details are foundational for writing and debugging the more complex code you will work on in CSC148. So let’s pause for a moment and be explicit about two things.

Executing an assignment statement

This is what Python does when an assignment statement is executed:

1. Evaluate the expression on the right-hand side, yielding the id of an object.
2. If the variable on the left-hand side doesn’t already exist, create it.
3. Store the id from the expression on the right-hand side in the variable on the left-hand side.

Evaluating an expression

An assignment statement always has an expression on the right-hand side. Expressions can occur in other places also, for instance as arguments to a function call. When an expression is encountered, it must be evaluated. This always yields a value, which is the id of an object. This is what Python does when an expression is evaluated:

- If the expression is a variable, find the variable. If it doesn’t exist, this is an error. If it does exist, the value of the expression is the id stored in that variable.
- If the expression is a “literal value”, such as 176.4 or 'hello', create an object of the appropriate type to hold it. The value of the expression is the id of that object.
- If the expression applies an operator, such as + or %, evaluate its two operands, apply the operator to them, and create a new object of the appropriate type to hold the result. The value of the expression is the id of that object.
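The three evaluation cases above (literal, variable, operator) can be observed with the id function. This is a small sketch; the variable names are chosen for illustration only.

```python
x = 176.4        # literal: an object is created and x stores its id
y = x            # variable: evaluates to the id already stored in x
total = x + 10   # operator: operands are evaluated, a new object holds the result

print(id(y) == id(x))      # True: y refers to the same object as x
print(id(total) == id(x))  # False: total refers to a different object
```

In each assignment, the right-hand side is evaluated first and yields an id; only then is that id stored in the variable on the left-hand side.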


'Diane Horton'
>>> # The old str object couldn't change, so Python made a new
>>> # str object for the variable prof to refer to. Since it's
>>> # a new object, it has a different id.
>>> id(prof)
4405308016

We did not change the value stored in the object—we couldn’t, since strings are immutable—but rather changed what prof refers to, as shown here:

We will use the convention of drawing a double box around objects that are immutable. Think of it as signifying that you can’t get in there and change anything.

Notice that in the example above we reassigned the variable prof, that is, we made it refer to a new str object, and we could do this even though strings are immutable. Regardless of the mutability of any objects, we can always reassign a variable.
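The difference between reassigning a variable and changing an immutable object can be seen in a short script. This is a sketch under assumed values: the starting string 'Diane' and the extra variable old are hypothetical and are used only to keep the original object visible.

```python
prof = 'Diane'            # hypothetical starting value
old = prof                # a second reference to the original str object

prof = prof + ' Horton'   # builds a new str object and reassigns prof

print(old)                   # Diane: the original object is unchanged
print(prof)                  # Diane Horton
print(id(old) == id(prof))   # False: prof now refers to a different object

try:
    old[0] = 'd'             # attempting to mutate a str object
except TypeError as error:
    print('TypeError:', error)
```

Reassignment succeeds because it only changes which id the variable stores; the attempted item assignment fails because it would have to change the str object itself.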

Mutable data types

More complex data structures in Python are mutable, including lists, dictionaries, and user-defined classes. Let’s see what this means with a list:

>>> x = [1, 2, 3]
>>> x
[1, 2, 3]
>>> type(x)
<class 'list'>
>>> id(x)
50706312

Below, we perform two mutating operations on x, and check that its id hasn’t changed. Note that even changing the list’s size doesn’t change its id!

>>> x[0] = 1000000
>>> x
[1000000, 2, 3]
>>> id(x)
50706312
>>> x.extend([10, 20, 30])
>>> x
[1000000, 2, 3, 10, 20, 30]


The lines x[0] = 1000000 and x.extend([10, 20, 30]) changed the value of the list object that x refers to. We say that these lines mutate the object that x refers to. (They also cause the creation of four new objects of type int.)
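To contrast mutation with the creation of a new object, the sketch below compares x.extend(...) with x = x + [...]. The variable names id_before and original are introduced here only for illustration.

```python
x = [1, 2, 3]
id_before = id(x)

x.extend([10, 20, 30])                  # mutates the list object in place
same_after_extend = id(x) == id_before
print(same_after_extend)                # True: still the same list object

original = x
x = x + [40]                            # builds a new list and reassigns x
same_after_plus = x is original
print(same_after_plus)                  # False: x refers to a new object
print(original)                         # [1, 2, 3, 10, 20, 30]
```

The first operation changes the value stored in an existing object; the second leaves that object alone and makes x refer to a different one.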

Aliasing

When two variables refer to the same object, we say that the variables are aliases of each other. [3]

[3] My dictionary says that the word “alias” is used when a person is also known under a different name. For example, we might say “Eric Blair, alias George Orwell.” We have two names for the same thing, in this case a person.

Consider the following Python code:

>>> x = [1, 2, 3]
>>> y = [1, 2, 3]
>>> z = x

x and z are aliases, as they both reference the same object. You should think of the assignment statement z = x as saying “make z refer to the object that x refers to.” After doing so, x and z have the same id.

>>> id(x)
4401298824
>>> id(z)
4401298824

In contrast, x and y are not aliases. They each refer to a list object with [1, 2, 3] as its value, but they are two different list objects, stored separately in your computer’s memory. This is again reflected in their different ids.

>>> id(x)
4401298824


Aliasing and mutation

Aliasing is often a source of confusion for beginners, because it allows “action at a distance”: the modification of a variable’s value without explicitly mentioning that variable. Here’s an example:

>>> x = [1, 2, 3]
>>> z = x
>>> z[0] = -999
>>> x  # What is the value?

The third line mutates the value of z. But without ever mentioning x, it also mutates the value of x! We call this a side effect. Imprecise language can lead us into misunderstanding the code. We said above that “the third line mutates the value of z”. To be more precise, the third line mutates the object that z refers to. Of course we can also say that it mutates the object that x refers to—they are the same object! A clear diagram like this can really help:
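The side effect can also be traced in a short script. The sketch below repeats the example and then, going slightly beyond the text above, builds a separate list with list(x) to show that an equal but distinct object is not affected; the name w is introduced only for this illustration.

```python
x = [1, 2, 3]
z = x            # alias: z and x refer to the same list object
z[0] = -999      # mutates that one object
print(x)         # [-999, 2, 3]

w = list(x)      # a new list object with equal contents, not an alias
w[0] = 0         # mutates only the new object
print(x)         # [-999, 2, 3]
print(w)         # [0, 2, 3]
```

Only variables that refer to the same object observe each other's mutations.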


Aliasing also exists for immutable data types, but in this case there is never any “action at a distance”, precisely because immutable values can never change. For example, a tuple is an ordered sequence like a list, but it is immutable. In the example below, x and z are aliases of a tuple object; but it is impossible to create a side effect on x by mutating the object z refers to, since we can’t mutate tuples at all.

>>> x = (1, 2, 3)
>>> z = x
>>> z[0] = -999
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

Changing a reference is not the same as mutating a value

What if we did this instead?

>>> x = (1, 2, 3)
>>> z = x
>>> z = (1, 2, 3, 40)
>>> x  # What is the value?

Again, we have made x and z refer to the same object. So when we change z on the third line, does x also change? This time, the answer is an emphatic no, and it is because of the kind of change we make on the third line. Instead of mutating the object that z refers to, we make z refer to a new object. This obviously can have no effect on the object that x refers to (or any object). Even if we switched the example from using immutable tuples to using mutable lists, x would be unchanged.
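The claim that x stays unchanged even with mutable lists can be checked with the following sketch, which repeats the example using lists instead of tuples.

```python
x = [1, 2, 3]
z = x                 # x and z are aliases
z = [1, 2, 3, 40]     # reassigns z to a new list; nothing is mutated

print(x)              # [1, 2, 3]
print(z)              # [1, 2, 3, 40]
print(x is z)         # False: they are no longer aliases
```

An assignment to the bare name z changes what z refers to, while z[0] = ... would change the object itself.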


>>> id(x)
4401298824
>>> id(y)
4404546056
>>> id(z)
4401298824

What if we wanted to see whether x and y, for instance, were the same? Well, we’d need to define precisely what we mean by “the same.” We can use the == operator to compare the values stored in the objects they reference. This is called value equality.

>>> x == y
True
>>> x == z
True

Or, we can use the is operator to compare the ids of the objects they reference. With is, we are asking whether two variables reference the exact same object. This is called identity equality.

>>> x is y
False
>>> x is z
True

All built-in types have an implementation for == so that we can check for value equality; we’ll later see how to define == for our own classes.
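The relationship between is and id can be spelled out in code. In this sketch the helper same_object is a hypothetical name; it compares ids directly, which gives the same answer as is for objects that exist at the same time.

```python
def same_object(a, b):
    """Identity equality written out with id."""
    return id(a) == id(b)


x = [1, 2, 3]
y = [1, 2, 3]
z = x

for name, other in (('y', y), ('z', z)):
    # value equality, identity equality, and the id comparison
    print(name, x == other, x is other, same_object(x, other))
```

The loop prints True False False for y and True True True for z: identity equality implies value equality here, but not the other way around.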

A special case with immutable objects

Because ints are immutable, there isn’t much point in Python creating a separate int object every time a variable needs to refer to, say, 0. They can all refer to the very same object and no harm can be done since the object can never change. This explains the following code:

>>> x = 43
>>> y = 43
>>> z = x
>>> # Of course we see that all three variables have value
>>> # equality. They all reference an int object containing 43.
>>> # Whether or not they are the same int object is irrelevant
>>> # to "==".
>>> x == y
True
>>> x == z
True
>>> # But "is" checks identity equality. We wouldn't have
>>> # expected x and y to reference the same int object. But now
>>> # we know that Python feels free to take a short-cut and not
>>> # create a second int object holding the value 43, and in
>>> # this case it did:
>>> x is y
True
>>> x is z
True
>>> # We can confirm that x and y have the same id:


It turns out that when Python does and doesn’t take the short-cut is quite complex, and it could even change from one version of Python to the next. But it makes no difference to our code’s behaviour; the only reason we need to be aware of it is so that we are not surprised when we see that two variables unexpectedly have identity equality.
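Since the short-cut may or may not be taken, code that compares ints is usually written with == so that it does not depend on it. The sketch below illustrates this; the function is_forty_three is a hypothetical example, and the result of x is y is deliberately not stated because it depends on the Python implementation.

```python
x = 43
y = 43

print(x == y)   # True: value equality does not depend on the short-cut
print(x is y)   # implementation-dependent; code should not rely on it


def is_forty_three(n):
    """Compare by value, which works whichever int object n refers to."""
    return n == 43


print(is_forty_three(int('43')))   # True
```

Using == keeps the program's behaviour the same whether or not two variables happen to share one int object.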
