Files
File Paths
A file has a filename and a path. The path specifies the location of a file on the computer, as a hierarchy of folders (also called directories).
File C:\photos\2018\home.jpg
- Filename: home.jpg
- Path: C:\photos\2018 (Windows uses the backslash \ as the separator symbol in paths)
- Folders in the path (C: is called the root folder):
C: {root}
└── photos
    └── 2018
Windows file names and paths are not case sensitive: C:\photos\2018\home.jpg is the same as C:\PHOTOS\2018\HOME.JPG.
File /Users/john/home.jpg
- Filename: home.jpg
- Path: /Users/john (OS-X/Linux uses the forward slash / as the separator symbol in paths)
- Folders in the path (the / at the start of the path is considered the root folder):
/ {root}
└── Users
    └── john
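Linux file names and paths are case sensitive: /Users/john/home.jpg is NOT the same as /USERS/JOHN/HOME.JPG. On OS-X this depends on how the disk is formatted (the default format is not case sensitive), so it is safest to always type names with their exact case.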
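The split between path and filename described above can also be done in code. The sketch below is only an illustration: it uses the standard modules ntpath (Windows path rules) and posixpath (OS-X/Linux path rules) directly so that both examples behave the same on any computer. The two file paths are the ones used above.

```python
import ntpath      # Windows-style path rules, importable on any OS
import posixpath   # OS-X/Linux-style path rules, importable on any OS

# split() returns a (folder path, filename) pair
win_folder, win_name = ntpath.split('C:\\photos\\2018\\home.jpg')
nix_folder, nix_name = posixpath.split('/Users/john/home.jpg')

print(win_folder, '|', win_name)  # C:\photos\2018 | home.jpg
print(nix_folder, '|', nix_name)  # /Users/john | home.jpg
```

In everyday code you would normally call os.path.split(), which applies the rules of the operating system the program is running on.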
The Python module os contains functions for dealing with files and folders. For example, you can use os.getcwd() to get the current working directory and os.chdir() to change the working directory to a different location.
This code shows how to print/change the current working directory:
import os
cwd = os.getcwd() # store current working dir
print(cwd) # print current working dir
os.chdir('C:\\temp\\python') # change dir
print(os.getcwd()) # print current working dir
os.chdir(cwd) # change working dir back to original
print(os.getcwd())
C:\photos\vaction
C:\temp\python
C:\photos\vaction
Note how the path 'C:\\temp\\python' uses a double backslash \\ to escape the \. In OS-X or Linux, it can be something like /user/john/python (no need for doubling, because the forward slash is not an escape character).
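As an aside, Python also has raw strings (an r prefix before the opening quote), which are another way to write Windows paths without doubling every backslash. The short sketch below compares the two spellings of the path used above.

```python
escaped = 'C:\\temp\\python'  # each \\ stands for a single backslash
raw = r'C:\temp\python'       # raw string: backslashes are taken literally

print(escaped)         # C:\temp\python
print(escaped == raw)  # True
```

Both variables hold exactly the same text; only the way it is typed in the source code differs.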
A path that specifies all folders starting from the root is an absolute path. A path that is specified relative to the current working directory is a relative path.
Assume the current working directory is C:\modules\tee3201 and you created a new folder inside it named exercises and put a file named ex1.txt in that folder.
- Absolute path of the file: C:\modules\tee3201\exercises\ex1.txt
- Relative path of the file: exercises\ex1.txt
In a path, you can use the dot . as a shorthand to refer to the current working directory. Similarly, .. can be used to refer to the parent directory.
If the current working directory is C:\modules\tee3201, you can use any of the following to refer to C:\modules\tee3201\exercises\ex1.txt:
exercises\ex1.txt
.\exercises\ex1.txt
..\tee3201\exercises\ex1.txt
..\..\modules\tee3201\exercises\ex1.txt
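To check that the four relative paths above really point to the same file, the sketch below resolves each one against the assumed working directory. It uses ntpath (the Windows flavour of os.path) so that it gives the same result on any operating system. The variable names are made up for this illustration.

```python
import ntpath  # Windows-style path rules, importable on any OS

cwd = 'C:\\modules\\tee3201'  # assumed current working directory
candidates = [
    'exercises\\ex1.txt',
    '.\\exercises\\ex1.txt',
    '..\\tee3201\\exercises\\ex1.txt',
    '..\\..\\modules\\tee3201\\exercises\\ex1.txt',
]

resolved = []
for relative in candidates:
    # join() prepends the working directory; normpath() collapses . and ..
    absolute = ntpath.normpath(ntpath.join(cwd, relative))
    resolved.append(absolute)
    print(relative, '->', absolute)
```

Every line printed ends with the same absolute path, C:\modules\tee3201\exercises\ex1.txt.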
You can use the os.makedirs() function to create folders and the os.removedirs() function to delete folders.
Example code showing how to create/delete directories:
print(os.getcwd())
os.makedirs('ex\\w1')
os.chdir('ex\\w1')
print(os.getcwd())
os.chdir('..') # go to parent dir
print(os.getcwd())
os.chdir('..')
os.removedirs('ex\\w1')
C:\repos\nus-tee3201\sample-code
C:\repos\nus-tee3201\sample-code\ex\w1
C:\repos\nus-tee3201\sample-code\ex
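One detail worth knowing is that os.removedirs() removes the last folder in the path and then keeps removing parent folders for as long as they are empty. The sketch below shows this in a throwaway temporary folder. The file name keep.txt is a made-up placeholder whose only job is to keep the temporary folder itself from being removed.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # A placeholder file keeps 'base' non-empty, so removedirs() stops there.
    f = open(os.path.join(base, 'keep.txt'), 'w')
    f.write('placeholder\n')
    f.close()

    nested = os.path.join(base, 'ex', 'w1')
    os.makedirs(nested)                # creates both 'ex' and 'ex/w1'
    created = os.path.isdir(nested)

    os.removedirs(nested)              # removes 'w1', then the now-empty 'ex'
    ex_still_exists = os.path.exists(os.path.join(base, 'ex'))
    base_still_exists = os.path.isdir(base)

print(created, ex_still_exists, base_still_exists)  # True False True
```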
The os.path module has many functions that can help with paths. For example, the os.path.join() function can be used to generate a file path that matches the current operating system.
Consider the code below:
cwd = os.getcwd()
print(os.path.join(cwd, 'ex', 'w2'))
If you run it on a Windows computer in the folder C:\modules\tee3201, it prints C:\modules\tee3201\ex\w2.
If you run it on an OS-X computer in the folder /Users/john, it prints /Users/john/ex/w2.
To ensure that your code can work on any OS, you are advised to use the os.path.join() function instead of hard-coding the separator.
Contrasting hard-coding the separator vs using os.path.join():
- Bad (Works only on Windows):
os.makedirs('ex\\w1')
- Good (Works on both Windows and OS-X):
os.makedirs(os.path.join('ex', 'w1'))
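To see what the join produces under each set of rules without switching computers, the sketch below calls the Windows and OS-X/Linux versions of join() directly (ntpath and posixpath are the standard modules that os.path maps to on each system). It then confirms that os.path.join() uses whatever separator the current system reports in os.sep.

```python
import ntpath
import os
import posixpath

windows_style = ntpath.join('ex', 'w1')    # joined with a backslash
posix_style = posixpath.join('ex', 'w1')   # joined with a forward slash
print(windows_style)  # ex\w1
print(posix_style)    # ex/w1

# os.path.join() picks the separator of the OS the code is running on
portable = os.path.join('ex', 'w1')
print(portable == 'ex' + os.sep + 'w1')  # True
```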
Reading from Files
This section focuses on reading from text-based files (i.e., not binary files).
There are three steps to reading files in Python:
- Call the open() function to receive a File object.
- Call the read() method on the File object to receive file content.
- Close the file by calling the close() method on the File object.
The code below shows how to read from a text file.
The 'r' argument in open(file_path, 'r') indicates that the file should be opened in read mode.
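The three steps can be sketched as follows. So that the example runs on any computer, it first creates a small sample file in a temporary folder. That set-up part, the file name items.txt and its two lines of content are assumptions made for this illustration only.

```python
import os
import tempfile

# Set-up only: create a small sample file so the example can run anywhere.
folder = tempfile.mkdtemp()
file_path = os.path.join(folder, 'items.txt')  # hypothetical sample file
f = open(file_path, 'w')
f.write('first line\nsecond line\n')
f.close()

f = open(file_path, 'r')  # step 1: open the file in read mode
content = f.read()        # step 2: read the whole content as one string
f.close()                 # step 3: close the file
print(content)

# Clean-up: remove the sample file and the temporary folder.
os.remove(file_path)
os.rmdir(folder)
```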
It is also possible to read the file content as a list of lines, using the readlines()
method.
The code below shows how to read file content as a list of lines.
file_path = os.path.join('data', 'items.txt')
f = open(file_path, 'r')
items = f.readlines()
print(items) # print as a list
for i in items: # print each item
    print(i.strip()) # use strip() to remove linebreak at the end of each line
f.close()
['first line\n', 'second line\n', 'third line\n']
first line
second line
third line
Note how each line ends with a \n which represents the line break. It can be removed using the strip() method.
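A commonly used alternative, shown here only as a sketch, is to open the file in a with statement, which closes the file automatically when the block ends, and to loop over the file object directly to get one line at a time. The temporary folder and the sample content are assumptions that make the example runnable anywhere.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    file_path = os.path.join(folder, 'items.txt')  # hypothetical sample file
    with open(file_path, 'w') as f:
        f.write('first line\nsecond line\nthird line\n')

    items = []
    with open(file_path, 'r') as f:  # f is closed automatically after the block
        for line in f:               # a file object yields one line at a time
            items.append(line.rstrip('\n'))  # drop only the trailing line break

print(items)  # ['first line', 'second line', 'third line']
```

Unlike strip(), which removes all whitespace at both ends of the string, rstrip('\n') removes only line breaks at the end.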
Writing to Files
Similar to reading from a file, writing to a file is also a three-step process. One main difference is that the file needs to be opened in write mode.
The code below shows how to write to a text file.
file_path = os.path.join('data', 'items.txt')
f = open(file_path, 'w') # open in write mode
f.write('first line\n')
f.write('second line\n')
f.close()
Contents of items.txt:
first line
second line
- The 'w' argument indicates that the file should be opened in write mode.
- Unlike the print() function, which ends its output with a line break every time, the write() method does not add an automatic line break at the end. You need to add a \n at each place you want a line break to appear in the file.
To preserve the original content and add to it, open the file in append mode. Opening a file in write mode erases whatever the file contained before it was opened, so anything you write replaces the old content.
The code below shows how to append to a file.
f = open(file_path, 'a') # open in append mode
f.write('third line\n')
f.close()
Contents of items.txt:
first line
second line
third line
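The difference between the two modes can be checked with a small experiment. The sketch below works in a temporary folder so that no real file is affected. The helper function read_all() and the sample text are made up for this illustration.

```python
import os
import tempfile

def read_all(path):
    """Return the whole content of a text file (helper for this example)."""
    f = open(path, 'r')
    content = f.read()
    f.close()
    return content

with tempfile.TemporaryDirectory() as folder:
    file_path = os.path.join(folder, 'items.txt')

    f = open(file_path, 'w')   # write mode: creates the file
    f.write('first line\n')
    f.close()

    f = open(file_path, 'a')   # append mode: keeps existing content
    f.write('second line\n')
    f.close()
    after_append = read_all(file_path)

    f = open(file_path, 'w')   # write mode again: old content is lost
    f.write('replacement\n')
    f.close()
    after_rewrite = read_all(file_path)

print(repr(after_append))   # 'first line\nsecond line\n'
print(repr(after_rewrite))  # 'replacement\n'
```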
CSV files
CSV files are often used as a simple way to save spreadsheet-like data. Each line in a CSV file represents a row in the spreadsheet, and commas separate the cells in the row. They usually have the .csv
extension and can be opened in spreadsheet programs such as Excel or in any text editor.
Here is the content of a simple CSV file and how it looks when opened in Excel.
If a value itself contains a comma, e.g., Foo, Emily, it can be enclosed in double quotes, e.g., "Foo, Emily", to prevent it from being misinterpreted as multiple values.
This example shows how to use double quotes to handle commas inside a value:
- 7/11/2017,"Foo, Emily",5 is interpreted as three values: 7/11/2017 and Foo, Emily and 5
- 7/11/2017,Foo, Emily,5 is interpreted as four values: 7/11/2017 and Foo and Emily and 5
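Python's built-in csv module applies the same quoting rule, which gives a quick way to verify the two interpretations above. In this sketch the two lines are fed to the module from an in-memory string (io.StringIO) rather than from a file, purely so that the example is self-contained.

```python
import csv
import io

text = '7/11/2017,"Foo, Emily",5\n7/11/2017,Foo, Emily,5\n'
rows = list(csv.reader(io.StringIO(text)))

for row in rows:
    print(len(row), row)
# 3 ['7/11/2017', 'Foo, Emily', '5']
# 4 ['7/11/2017', 'Foo', ' Emily', '5']
```

Note that in the unquoted version the space after the comma stays attached to the value, so the third item is ' Emily' with a leading space.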
Python has a built-in module named csv that provides functions to deal with CSV files more conveniently, although CSV files are text files that can also be read/written using the normal file access techniques covered earlier. For example, the module provides a way to read a CSV file through a Reader object that knows how to interpret a CSV file.
The code below shows how to use the csv
module to read contents of a CSV file named deliveries.csv
:
import csv
deliveries_file = open('deliveries.csv') # open file
deliveries_reader = csv.reader(deliveries_file) # create a Reader
for row in deliveries_reader: # access each line using the Reader
    print(row)
deliveries_file.close() # close file
['4/11/2017', 'Alice Bee', '4']
['5/11/2017', 'Chris Ding', '12']
['5/11/2017', 'Brenda Chew', '13']
['6/11/2017', 'Dan Pillai', '5']
As you can see, the Reader object returns the content of each line as a list object, with the value of each cell as an item in the list. Replacing the line,
...
    print(row)
...
with the following line,
...
    print('Date:', row[0], '\tRecipient:', row[1], '\tQuantity:', row[2])
...
will give you the output shown below:
Date: 4/11/2017 Recipient: Alice Bee Quantity: 4
Date: 5/11/2017 Recipient: Chris Ding Quantity: 12
Date: 5/11/2017 Recipient: Brenda Chew Quantity: 13
Date: 6/11/2017 Recipient: Dan Pillai Quantity: 5
Note that all values read from a CSV file come as strings. If they are meant to represent other types, you need to convert the string to the correct type first.
In this example the 3rd value of each row is converted to an int
before adding them up.
deliveries_file = open('deliveries.csv')
deliveries_reader = csv.reader(deliveries_file)
total = 0
for row in deliveries_reader:
    # convert 3rd cell to an int and add to total
    total = total + int(row[2])
print('Total quantity delivered:', total)
deliveries_file.close()
Total quantity delivered: 34
The csv module also provides an easy way to write to CSV files, one row at a time, using a Writer object.
The code below writes two rows to the pricelist.csv
file.
output_file = open('pricelist.csv', 'w', newline='') # open file in write mode
output_writer = csv.writer(output_file) # get a Writer object
output_writer.writerow(['apples', '1', '1.5', 'True']) # write one row
output_writer.writerow(['bananas', '3', '2.0', 'False']) # write another row
output_file.close() # close file
The pricelist.csv
file will now contain:
apples,1,1.5,True
bananas,3,2.0,False
- You can open a file in append mode if you want to append to it instead of overwriting current content, e.g., output_file = open('pricelist.csv', 'a', newline='')
- The keyword argument newline='' needs to be used when opening a CSV file for the csv module, and it matters most on Windows. In short, the Writer object ends each row with its own line break, and without newline='' Windows adds a second one, so blank lines can appear between rows.
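The sketch below makes both points visible by writing to an in-memory buffer (io.StringIO) instead of a real file, so the exact characters produced by the Writer can be inspected. The buffer is only a stand-in for a file opened with newline=''. The first row reuses the Foo, Emily value from earlier to show that the Writer adds the double quotes for you.

```python
import csv
import io

buffer = io.StringIO()  # in-memory stand-in for a file opened with newline=''
writer = csv.writer(buffer)
writer.writerow(['7/11/2017', 'Foo, Emily', '5'])  # value containing a comma
writer.writerow(['apples', '1', '1.5', 'True'])
written = buffer.getvalue()

print(repr(written))
# '7/11/2017,"Foo, Emily",5\r\napples,1,1.5,True\r\n'
```

Each row already ends with \r\n, the Windows-style line break, which is why the file should be opened in a way that does not add another one.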