Lab 10: Network Socket Programming with Python
In this lab, you will use the Python programming language to implement a simple HTTP client that can download files from a webserver.
Install the latest (3.2+) version of Python. Typically, Ubuntu distributions only come with the older (but still widely used) 2.x Python. We don't care about backwards compatibility, so full speed ahead!
Also, install the image viewer eog, which is used by the image download program later in the lab. (This viewer is present on standard Ubuntu distributions, but not on all of the variants.)
unix> sudo apt-get install python3 eog
Verify that Python is working by asking it to print its version number, which will be useful when debugging errors or quirks later. Note: The option is a CAPITAL letter V!
unix> python3 -V
The output should be similar to: Python 3.2.3
There are a number of official and unofficial references for Python that you may find useful. Be warned, however, that Python syntax and libraries have changed between versions 2.x and 3.x. Thus, for this lab, you should prefer newer Python 3.x references where possible.
- Official - Python documentation (for Python 3.x)
- Official - Python Standard Library documentation (for Python 3.x)
- Official - Python sockets documentation (for Python 3.x)
- TutorialsPoint Python Examples
- Differences between Python 2.x and 3.x
- Google University Python Tutorial (written for the older Python 2.7 syntax, but still useful)
- Python Sockets Examples (written for the older Python 2.7 syntax, but still useful)
To begin this lab, start by obtaining the necessary boilerplate code. Enter the class repository:
unix> cd ~/bitbucket/2012_fall_ecpe170_boilerplate/
Pull the latest version of the repository, and update your local copy of it:
unix> hg pull
unix> hg update
Copy the files you want from the class repository to your private repository (in this case, just one file):
unix> cp ~/bitbucket/2012_fall_ecpe170_boilerplate/lab10/download.py ~/bitbucket/2012_fall_ecpe170/lab10/
Enter your private repository now, specifically the lab10 folder:
unix> cd ~/bitbucket/2012_fall_ecpe170/lab10
Add the new files to version control in your private repository:
unix> hg add download.py
Commit the new files in your personal repository, so you can easily go back to the original starter code if necessary:
unix> hg commit -m "Starting Lab 10 with boilerplate code"
Push the new commit to the bitbucket.org website:
unix> hg push
(1) All source code and lab report PDF must be submitted via Mercurial. Place the source files inside the lab10 folder that was previously created.
Lab Part 1 - Hello World
Using your favorite text editor, create a file (hello.py) with the usual "Hello World" starter code:
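The starter code is not reproduced in this handout. A minimal sketch, assuming the conventional env-based interpreter line and a hypothetical variable name greeting, might look like this:

```python
#!/usr/bin/env python3
# Minimal "Hello World" sketch. The interpreter line above and the
# variable name "greeting" are assumptions, not the official starter code.
greeting = "Hello World"
print(greeting)
```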
Mark the file as "executable" so you can run it as a program:
unix> chmod +x hello.py
Now execute the Python program:
unix> ./hello.py
Check hello.py into version control when finished.
(1) What is the line that starts with #! doing? Where in ECPE 170 have you seen this before?
Lab Part 2 - Python Basic Skills
Write a Python3 program called demo.py. This program should be invoked via:
unix> ./demo.py <word1> <word2>
Demonstrate your knowledge of fundamental Python skills by performing the following operations in the program:
- Determine how many arguments have been provided to the script on the command line. If there are two arguments (*not* including the program name itself), print them out one at a time. Otherwise, exit immediately.
- Concatenate the two string arguments together, save them to a new variable called onestring, and then print onestring.
- Using a for-loop, write a sequence of numbers from 1 to 10 in increments of 1 to a file on disk.
Note: This demo program can be very short! No need to get fancy - save that for the HTTP download client.
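One possible shape for these operations is sketched below. The helper names (join_words, write_numbers, main) and the output file name numbers.txt are hypothetical choices made for illustration; the lab does not prescribe them.

```python
#!/usr/bin/env python3
# Sketch only: helper names and the output file name are assumptions.
import sys


def join_words(args):
    """Return the two words concatenated, or None if the count is wrong."""
    if len(args) != 2:
        return None
    return args[0] + args[1]


def write_numbers(path, start=1, stop=10):
    """Write the integers start..stop (inclusive), one per line, to path."""
    with open(path, "w") as outfile:
        for number in range(start, stop + 1):
            outfile.write(str(number) + "\n")


def main(argv):
    args = argv[1:]  # argv[0] is the program name, so skip it
    onestring = join_words(args)
    if onestring is None:
        return 1  # wrong number of arguments: stop immediately
    for word in args:
        print(word)
    print(onestring)
    write_numbers("numbers.txt")
    return 0
```

To run this from the command line, the file would end with a call such as sys.exit(main(sys.argv)).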
Check demo.py into version control when finished.
Lab Part 3 - HTTP Basic Skills
Before writing a program that communicates with an HTTP server, you are going to manually test your knowledge of HTTP. The telnet client program allows you to open a TCP socket to a port and send ASCII characters. A telnet server normally listens on port 23, which the telnet client uses as a default. However, we can specify that the telnet client use any port number we want, such as port 80 to communicate with a web server.
To invoke the Telnet client:
unix> telnet www.google.com 80
Once the connection to the web server is open, you can send an HTTP request. Here is an example HTTP request to download the file at http://www.google.com/about/
GET /about/ HTTP/1.1
<<SERVER RESPONSE STARTS HERE>>
Note that the HTTP client (in this case, you!) must send an extra blank line after the last request line. This trigger tells the web server to begin processing the request. (Technically, the web server is looking for a \r\n\r\n sequence of characters). After the request is sent, the reply should immediately follow on the same connection.
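The role of the blank line can be illustrated from the receiver's point of view. This is a simplified sketch, not how any particular web server is implemented; the names HEADER_END and request_is_complete are hypothetical.

```python
# Sketch: a receiver can tell that the request headers are finished
# once the byte sequence \r\n\r\n has arrived. Names are hypothetical.
HEADER_END = b"\r\n\r\n"


def request_is_complete(buffer):
    """Return True once the blank line ending the headers has been seen."""
    return HEADER_END in buffer


# Only the request line so far: the server keeps waiting.
partial = b"GET /about/ HTTP/1.1\r\n"
# The extra blank line completes the \r\n\r\n sequence.
complete = partial + b"\r\n"

print(request_is_complete(partial))
print(request_is_complete(complete))
```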
(2) Document the HTTP request and the server response when you manually download the HTML file at http://ecs-network.serv.pacific.edu/ecpe-170/labs/ via Telnet.
(To "document" a download, provide the full client request and a partial server response. The top 40-50 lines are sufficient for me to tell whether you downloaded the right file. The script utility can make this capture easy for you - see below.)
(3) Document the HTTP request and the server response when you manually download the HTML file at http://www.yahoo.com/ via Telnet.
(4) Document the HTTP request and the server response when you manually download the PNG image file at http://www.google.com/images/logos/google_logo_41.png via Telnet.
Note: For the image download, consider whether it makes sense to include the server response (at least, the data portion) in your lab report.
Requirements for your HTTP request:
- Use the HTTP 1.1 protocol
- Specify the Host field, which is the domain name of the server that should answer your request. (This field is needed because multiple web sites -- for example, gmail.google.com and www.google.com -- may be served from the same IP address.)
- Specify that the web server close the socket connection immediately after sending the requested file. (This allows for a simpler client implementation.)
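A request meeting these three requirements could be assembled in Python roughly as follows. This is a sketch under the assumption that the Host and Connection header fields are the ones intended; the function name build_request is hypothetical.

```python
# Sketch: assemble an HTTP/1.1 GET request as bytes.
# build_request is a hypothetical helper, not part of the boilerplate.
def build_request(host, path):
    lines = [
        "GET " + path + " HTTP/1.1",  # request line, HTTP 1.1 protocol
        "Host: " + host,              # which site should answer
        "Connection: close",          # ask the server to close when done
        "",                           # the two empty strings produce the
        "",                           # final \r\n\r\n after joining
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_request("www.google.com", "/about/"))
```

Joining with \r\n rather than \n matters here, because HTTP header lines are terminated by a carriage return and a line feed.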
Tip: Want to use the script utility to make documentation easy? The following command will tell script to run the command "telnet www.google.com 80" interactively, save all keyboard input and program output to the file easy_documentation.txt, and stop saving when the telnet program exits.
unix> script -c "telnet www.google.com 80" easy_documentation.txt
Lab Part 4 - HTTP Download Client
Write a Python3 program called download.py to retrieve files from a web server via HTTP. Although this program could retrieve files of any type, we will use it solely to retrieve image files, and then display them after they have been downloaded. Your program will take 1 argument, the full URL of the image to display, e.g.:
unix> ./download.py http://www.google.com/images/logos/google_logo_41.png
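Before any request can be sent, the URL has to be separated into the server name and the path of the file. A minimal sketch using only string operations is shown below; the function name split_url is hypothetical, and the sketch assumes plain http:// URLs with no explicit port number.

```python
# Sketch: split an http:// URL into (host, path) with string methods.
# split_url is a hypothetical helper; port numbers are not handled.
def split_url(url):
    prefix = "http://"
    if not url.startswith(prefix):
        raise ValueError("only http:// URLs are handled in this sketch")
    remainder = url[len(prefix):]
    # Everything before the first "/" is the host; the rest is the path.
    host, _slash, path = remainder.partition("/")
    return host, "/" + path


host, path = split_url("http://www.google.com/images/logos/google_logo_41.png")
print(host)
print(path)
```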
Python has a built-in HTTP client module. Be warned, however: You *cannot* use it for this lab, and zero points will be awarded if you do! That module is off-limits because it hides how the HTTP protocol works, and the purpose of this lab is to learn how the protocol actually operates. Instead, you must use the lower-level socket module.
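The general pattern with the socket module is sketched below: open a TCP connection, send the request bytes, read until the server closes the connection, and then separate the headers from the data. This is one possible approach and not the structure of the provided boilerplate; fetch_raw and split_response are hypothetical names, and the sketch assumes the request asked the server to close the connection when finished.

```python
# Sketch of a raw-socket download. Function names are hypothetical, and
# the read loop assumes the server closes the connection after replying.
import socket


def fetch_raw(host, port, request):
    """Send request bytes and return everything received before close."""
    chunks = []
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request)
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # empty read: the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


def split_response(raw):
    """Split a raw HTTP response into (header text, body bytes)."""
    headers, _separator, body = raw.partition(b"\r\n\r\n")
    # Headers are text; the body stays as bytes because it may be an image.
    return headers.decode("iso-8859-1"), body
```

Keeping the body as bytes matters for image files, since binary data cannot be safely decoded as text before it is written to disk.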
Tip: There is substantial boilerplate code provided for this exercise, and further instructions and hints contained within.
Your program will be tested with the following image URLs:
Lab Report - Wrapup:
(1) What was the best aspect of this lab?
(2) What was the worst aspect of this lab?
(3) How would you suggest improving this lab in future semesters?