CS 453 (Spring 2011): Computer Networks
Programming Assignment 1
Instructor : V. Arun
Assigned : Feb 16, 2011
Due : Mar 9, 2011
1. Overview
This assignment is designed to introduce you to socket programming. The
assignment asks you to implement a program to download files using a
traditional client/server approach (like HTTP) and compare it to a
peer-to-peer approach (like BitTorrent).
Your goal is to write a network program, referred to as the client,
that downloads an image file from a server that we maintain. The client
must implement two options:
- Client/server: Request the entire file from the server, similar to
HTTP.
- Peer-to-peer: Ask the server for the addresses of other peers
that possess parts of the file, called blocks, and download these blocks
from different peers.
Your client needs to download the image as fast as possible. Note that
the first option will not yield the fastest download as the server
(as well as each of the individual peers) services requests at a
constrained rate, so relying on only one server (or peer) to get the
entire file is likely to yield a poor download rate.
The rest of this document specifies (1) the protocol for the
client/server option, (2) the protocol for the peer-to-peer option, (3)
design constraints, hints and suggestions, and submission instructions.
2. Client/server option
In the client/server option, the client requests the entire file from
the server over TCP, similar to HTTP. The server is running on the
following <IP address, port number> combinations: <128.119.41.52, 18765>
and <128.119.245.8, 18765>. You may use either one of the two servers. We support
two servers just for fault-tolerance (and to defend against
unintentional DoS attacks when everyone decides to stress-test their
client the weekend before the assignment is due).
File request format: The client must send a request in the following
format to the server to download filename, where the ‘\n’ at the end is
the newline character.
GET filename\n
The filename to use for this assignment is Redsox.jpg. Thus, the client
must send the following string to request the file.
GET Redsox.jpg\n
File response format: The response to the file request has the
following format:
<block_offset(4 bytes)><block_size(4 bytes)><Data>
That is, the first four bytes contain the offset of the first byte in
the block, the next four bytes contain the block size, followed by the
data in the block. In the client/server option, the entire file is sent
as one block, so the first field will be 0, the second field will be
the size of the file, followed by data of length specified by the
second field.
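The exchange above can be sketched in Java roughly as follows. This is a minimal illustration, not the reference client: the class and method names are hypothetical, and it assumes the two 4-byte header fields are big-endian (network byte order) integers, which this document does not state explicitly.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
class WholeFileFetch {
    /** One parsed response: where the block starts in the file, and its bytes. */
    static final class Block {
        final int offset;
        final byte[] data;
        Block(int offset, byte[] data) { this.offset = offset; this.data = data; }
    }

    /** Reads <block_offset><block_size><Data> from a stream. */
    static Block readBlock(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);
        int offset = din.readInt();   // assumption: big-endian header fields
        int size = din.readInt();
        if (size < 0) throw new IOException("negative block size: " + size);
        byte[] data = new byte[size];
        din.readFully(data);          // keeps reading until all 'size' bytes have arrived
        return new Block(offset, data);
    }

    /** Sends "GET filename\n" over TCP and returns the single block holding the file. */
    static Block fetch(String host, int port, String filename) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            DataOutputStream out = new DataOutputStream(socket.getOutputStream());
            out.write(("GET " + filename + "\n").getBytes(StandardCharsets.US_ASCII));
            out.flush();
            return readBlock(socket.getInputStream());
        }
    }
}
```

A call such as WholeFileFetch.fetch("128.119.41.52", 18765, "Redsox.jpg") would then return the whole file as one block with offset 0, provided the byte-order assumption holds.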
3. Peer-to-peer option
In the peer-to-peer option, the client must first obtain the
“torrent metadata” from the tracker. The torrent metadata
contains information about the number and size of blocks constituting
the file and peers (<IP,port> combinations) from which the blocks
may be downloaded.
3.1. Torrent metadata
Torrent metadata request format: The client must download the torrent
metadata for filename by sending a UDP message in the following format:
GET filename.torrent
Thus, to request the torrent metadata for Redsox.jpg, the client must
send a UDP message containing the string “GET
Redsox.jpg.torrent” (with no newline).
This UDP-based torrent metadata server is running at <128.119.41.52, 19876>
and <128.119.245.8, 19876>. As above, you may use either one of the two.
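A minimal Java sketch of the metadata request follows. The class name, the timeout and retry parameters, and the 1024-byte receive buffer are assumptions made for illustration; the retry loop is there only because UDP does not guarantee delivery.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative only.
class TrackerQuery {
    /** Builds the request payload: "GET <filename>.torrent" with no trailing newline. */
    static byte[] buildRequest(String filename) {
        return ("GET " + filename + ".torrent").getBytes(StandardCharsets.US_ASCII);
    }

    /**
     * Sends the request and waits for one reply, resending on timeout.
     * Returns the raw reply bytes, or throws if every attempt times out.
     */
    static byte[] query(InetAddress tracker, int port, String filename,
                        int timeoutMillis, int maxAttempts) throws IOException {
        byte[] request = buildRequest(filename);
        byte[] buffer = new byte[1024]; // assumption: a reply fits in 1024 bytes
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(timeoutMillis);
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                socket.send(new DatagramPacket(request, request.length, tracker, port));
                DatagramPacket reply = new DatagramPacket(buffer, buffer.length);
                try {
                    socket.receive(reply);
                    return Arrays.copyOf(reply.getData(), reply.getLength());
                } catch (SocketTimeoutException e) {
                    // The request or the reply was lost, or the tracker is slow; try again.
                }
            }
        }
        throw new IOException("no reply from tracker after " + maxAttempts + " attempts");
    }
}
```

For example, TrackerQuery.query(InetAddress.getByName("128.119.41.52"), 19876, "Redsox.jpg", 2000, 5) would send the request up to five times, waiting two seconds for each reply.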
Torrent metadata response format:
The response to the request for torrent metadata is in the following
format:
<number_of_blocks(4 bytes)><size_of_the_entire_file(4 bytes)><IP1(4 bytes)><port1(4 bytes)><IP2(4 bytes)><port2(4 bytes)>
The names of the fields above are self-explanatory. number_of_blocks is
the number of blocks in the requested file. size_of_the_entire_file is
the size of the entire file in bytes. All blocks except the last block
are of the same size, namely,
floor[size_of_the_entire_file/number_of_blocks]. <IP1> and <port1>
identify the IP address and port number of the first peer. Similarly,
<IP2> and <port2> identify the second peer.
Each response will contain two randomly chosen peer identifiers. You
can query the tracker multiple times to get more peer identifiers.
However, the tracker is designed to rate-limit the queries, so you may
not get responses promptly if you send requests too fast or may not get
responses at all as UDP messages can get lost.
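The 24-byte response could be decoded along the following lines. This is a sketch under stated assumptions: the class name is hypothetical, all integer fields are assumed to be big-endian, and the four IP bytes are assumed to arrive highest-order byte first, which is the order InetAddress.getByAddress expects.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical value class; the names are illustrative only.
class TorrentMetadata {
    final int numberOfBlocks;
    final int fileSize;
    final List<InetSocketAddress> peers;

    TorrentMetadata(int numberOfBlocks, int fileSize, List<InetSocketAddress> peers) {
        this.numberOfBlocks = numberOfBlocks;
        this.fileSize = fileSize;
        this.peers = peers;
    }

    /** Decodes <number_of_blocks><size><IP1><port1><IP2><port2>. */
    static TorrentMetadata parse(byte[] reply) {
        if (reply.length < 24) {
            throw new IllegalArgumentException("reply too short: " + reply.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(reply); // ByteBuffer reads big-endian by default
        int blocks = buf.getInt();
        int size = buf.getInt();
        List<InetSocketAddress> peers = new ArrayList<>();
        for (int i = 0; i < 2; i++) {            // each response carries two peers
            byte[] ip = new byte[4];
            buf.get(ip);
            int port = buf.getInt();
            try {
                peers.add(new InetSocketAddress(InetAddress.getByAddress(ip), port));
            } catch (UnknownHostException e) {
                // getByAddress only rejects arrays that are not 4 or 16 bytes long.
                throw new IllegalArgumentException(e);
            }
        }
        return new TorrentMetadata(blocks, size, peers);
    }
}
```

Because each response names only two peers, a client would typically call parse on several responses and merge the peer lists, dropping duplicates.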
3.2. Data blocks
Data blocks must be requested using TCP as follows.
Request format: The following request fetches a specific block:
GET filename <block-number>\n
where block-number is an integer identifying the block in filename. For
example, you may request block 24 in Redsox.jpg by sending the string
“GET Redsox.jpg 24\n” to any one of the peers received in
the torrent metadata above. Note that the servers listed in the
client/server option also act as peers and support the above request
format to request specific blocks.
Specifying ‘*’ instead of a block number returns a randomly chosen
block:
GET filename *\n
Response format: The response to a block request has the following
format:
<block_offset(4 bytes)><block_size(4 bytes)><Data>
As before, <block_offset>
identifies the position of the first byte in the block, <block_size> is its length,
followed by the actual <Data>
of that length. All blocks except the last block are of the same size.
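One way to request a block and store it at the right position is sketched below in Java. The class and method names are hypothetical, the header fields are assumed to be big-endian, and the caller is assumed to have allocated a buffer as large as the whole file.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
class BlockFetch {
    /** Builds the request line for one numbered block, e.g. "GET Redsox.jpg 24\n". */
    static String requestLine(String filename, int blockNumber) {
        return "GET " + filename + " " + blockNumber + "\n";
    }

    /**
     * Sends one block request on an already-open connection, reads the reply, and
     * copies the data into fileBuffer at the offset reported by the peer.
     * Returns the number of data bytes stored.
     */
    static int fetchInto(OutputStream toPeer, InputStream fromPeer, String filename,
                         int blockNumber, byte[] fileBuffer) throws IOException {
        toPeer.write(requestLine(filename, blockNumber).getBytes(StandardCharsets.US_ASCII));
        toPeer.flush();

        DataInputStream in = new DataInputStream(fromPeer);
        int offset = in.readInt();   // assumption: big-endian header fields
        int size = in.readInt();
        if (offset < 0 || size < 0 || size > fileBuffer.length - offset) {
            throw new IOException("block does not fit: offset=" + offset + " size=" + size);
        }
        in.readFully(fileBuffer, offset, size); // place the data directly at its file position
        return size;
    }
}
```

Using the offset from the response, rather than one computed from the block number, also works for ‘*’ requests, where the client does not know in advance which block it will receive.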
4. Design constraints
Your objective is to download the file as fast as possible. However,
you must do so while respecting the following constraints. Not
respecting these constraints will be considered cheating.
- You can maintain only one open connection to each port
discovered. You can download multiple blocks over this connection. You
can also close this connection and open a new one to the same port.
- You must not try to probe for random ports (for example, by
searching iteratively through the port address space) to discover more
ports.
- You must not remember ports from previous runs and use them by
hard-coding them in your program.
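The first constraint can be enforced in code with a small registry of peers that currently have an open connection. The Java sketch below is one possible approach, not a required design; the class name is hypothetical, and it interprets "each port" as each <IP, port> pair returned by the tracker.

```java
import java.net.InetSocketAddress;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class; the names are illustrative only.
class PeerLeases {
    private final Set<InetSocketAddress> inUse = new HashSet<>();

    /** Returns true if the caller may open a connection to this peer right now. */
    synchronized boolean tryAcquire(InetSocketAddress peer) {
        return inUse.add(peer); // add() returns false when the peer is already in use
    }

    /** Call after closing the connection so that the peer can be used again. */
    synchronized void release(InetSocketAddress peer) {
        inUse.remove(peer);
    }
}
```

A download thread would call tryAcquire before connecting and release in a finally block after closing the socket, so that at most one connection per peer is open at any time.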
4.1. Hints and suggestions
- We encourage you to use multiple threads to simultaneously
download blocks from different peers. Remember to carefully synchronize
access to shared data structures when using threads. If you have not
taken an Operating Systems course or do not otherwise have experience
using threads, this part will incur a steep learning curve.
- A separate utility class, Utility.java,
has been provided so that you can focus on the networking aspects of
the project. This class has convenient methods for: (i) converting a
byte array to an integer; (ii) converting an integer to a byte array;
(iii) converting a byte array to a JPEG image file.
- Plan to complete the assignment well before the due date. A
successful run of the entire program to download the file takes roughly
30 minutes, so debugging and testing may take longer than expected.
- Comment your code as much as possible. Even if there are minor
errors, the comments will convey your logic and hence help you get more
points.
- Remember to substitute filename in the above commands with
Redsox.jpg. An easy way to check if you got all the bytes in the file
correctly is to view the downloaded image.
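For the threading hint above, one simple way to synchronize access to shared state is to keep the list of outstanding blocks behind synchronized methods. The Java sketch below is an illustration only; the class name is hypothetical, and it numbers blocks from 1 as this assignment does.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper class; the names are illustrative only.
class BlockTracker {
    private final Deque<Integer> pending = new ArrayDeque<>();
    private int remaining;

    BlockTracker(int numberOfBlocks) {
        for (int n = 1; n <= numberOfBlocks; n++) {
            pending.add(n);
        }
        remaining = numberOfBlocks;
    }

    /** Hands out the next block number to fetch, or -1 if none is waiting. */
    synchronized int claim() {
        Integer next = pending.poll();
        return next == null ? -1 : next;
    }

    /** Returns a block to the queue after a failed download so it can be retried. */
    synchronized void giveBack(int blockNumber) {
        pending.add(blockNumber);
    }

    /** Records that one claimed block has been downloaded and stored. */
    synchronized void markDone() {
        remaining--;
    }

    /** True once every block has been downloaded. */
    synchronized boolean allDone() {
        return remaining == 0;
    }
}
```

Each worker thread would loop on claim(), download the block from its own peer, and then call markDone() or giveBack(). Because giveBack() can refill the queue, a worker that sees -1 should check allDone() before exiting rather than assume the download is finished.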
4.2. Things to keep in mind
- All data transfer uses TCP, but the torrent metadata uses UDP.
- Use correct <IP address, port> combinations.
- You may use DataOutputStream and writeBytes for sending out the
requests, and DataInputStream for receiving responses, similar to the
in-class example. You are encouraged to explore and use other commands
supported by the Java socket API.
- You may use methods in the InetAddress class to parse the IP
address from the torrent response.
- Blocks are numbered starting from 1. Block offsets (the position
of the first byte of the block in the file) start from 0.
- Each block is of size 10000 bytes. You will also see this in the
torrent metadata response message. The total size of the file
Redsox.jpg is 805975 bytes. There is also a smaller file test.jpg of
58241 bytes for quickly testing your code. You can use all the
supported commands to download either file.
- Make sure your client is resilient to long-lived connections and
delays in data reception.
- You can use ‘telnet’ as we saw in class to check that
the servers actually work. For example, you can do telnet <IP>
<port> and send commands such as GET Redsox.jpg\n or GET
Redsox.jpg 3\n and the server will return the corresponding data (in
binary) on the command line.
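The block numbering and sizes above imply some simple arithmetic, worked through in the Java sketch below. It assumes every block except the last is exactly 10000 bytes and the last block holds the remainder; the class and method names are hypothetical, and the number_of_blocks field in the torrent metadata remains the authoritative count.

```java
// Hypothetical helper class; the names are illustrative only.
class BlockMath {
    static final int BLOCK_SIZE = 10000;

    /** Number of blocks needed to cover fileSize bytes (rounds up). */
    static int blockCount(int fileSize) {
        return (fileSize + BLOCK_SIZE - 1) / BLOCK_SIZE;
    }

    /** Offset of the first byte of a block; blocks are numbered from 1, offsets from 0. */
    static int offsetOf(int blockNumber) {
        return (blockNumber - 1) * BLOCK_SIZE;
    }

    /** Size of a block; only the last block may be shorter than BLOCK_SIZE. */
    static int sizeOf(int blockNumber, int fileSize) {
        return Math.min(BLOCK_SIZE, fileSize - offsetOf(blockNumber));
    }

    public static void main(String[] args) {
        System.out.println(blockCount(805975));  // Redsox.jpg: 81 blocks
        System.out.println(offsetOf(81));        // last block starts at byte 800000
        System.out.println(sizeOf(81, 805975));  // last block holds 5975 bytes
        System.out.println(blockCount(58241));   // test.jpg: 6 blocks
        System.out.println(sizeOf(6, 58241));    // last block holds 8241 bytes
    }
}
```

Comparing these expected values with the offsets and sizes actually returned by the peers is a cheap sanity check while debugging.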
4.3. Submission instructions
- You must submit your client program. Name the main file
TorrentClient.<appropriate-extension>. Feel free to write
additional helper classes alongside this main file.
- You can use any programming language of your choice. The nice part
about network programming using the socket API is that the server and
client may be written in different programming languages.
- We suggest the following command line format for running the
client in the client/server or P2P mode. For example, if you are using
Java, use
java TorrentClient CS <IP> <port>
java TorrentClient P2P <IP> <port>
where “CS” means that the client will use the client/server
option and “P2P” means that the client will use the P2P
option. The IP and port are the IP and port of the (TCP-based) file
server in the former case and the UDP-based torrent metadata server in
the latter.
- Write a design document outlining your algorithm and the test
cases used to test the client. Name this document
<firstname.lastname>-PA1-design.<appropriate-extn>. For example, John
Smith submitting the design doc as a PDF will submit it as
john.smith-PA1-design.pdf.
- Attach output of your program executing. The output may be a
screen shot or the text output from your console. The output MUST
include the total download time taken for the file to be downloaded:
  - Using parallel downloads of the file’s constituent blocks.
  - Using a single download of the whole file.
- Name this file
<last-name>-pa1-output.<appropriate-extension>. For example, John
Smith submitting a screenshot as a JPEG will submit it as
john.smith-pa1-output.jpg.
- Zip all the above-mentioned items into a *SINGLE* zip file and
submit. The zip file must be of the form
<last-name>-pa1.<extension>
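The suggested command line and the required download-time output could be handled roughly as in the Java sketch below. The class name, the usage message, and the port range check are assumptions made for illustration.

```java
import java.util.Locale;

// Hypothetical helper class; the names are illustrative only.
class TorrentClientArgs {
    enum Mode { CS, P2P }

    final Mode mode;
    final String host;
    final int port;

    TorrentClientArgs(Mode mode, String host, int port) {
        this.mode = mode;
        this.host = host;
        this.port = port;
    }

    /** Parses "<CS|P2P> <IP> <port>"; throws IllegalArgumentException on bad input. */
    static TorrentClientArgs parse(String[] args) {
        if (args.length != 3) {
            throw new IllegalArgumentException("usage: java TorrentClient <CS|P2P> <IP> <port>");
        }
        Mode mode = Mode.valueOf(args[0].toUpperCase(Locale.ROOT));
        int port = Integer.parseInt(args[2]);
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return new TorrentClientArgs(mode, args[1], port);
    }

    /** Converts two System.nanoTime() readings into elapsed seconds. */
    static double elapsedSeconds(long startNanos, long endNanos) {
        return (endNanos - startNanos) / 1e9;
    }
}
```

A client could record System.nanoTime() immediately before the first request and after the last byte is stored, then print elapsedSeconds(start, end) as the total download time for each mode.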
4.4. Grading Scheme
- Design document (25 points): The design document should clearly
explain your strategy and why you expect the strategy to work well. The
document should also explain test cases that you used to verify that
your code works correctly and downloads as fast as expected under
different network conditions. If you made any assumptions about the
server not specified above, you should explicitly state these
assumptions.
- Code implementation and documentation (35 points).
- Correct execution (40 points). We will run your code and view
your downloaded file to check this.