CS 453 (Spring 2011): Computer Networks
Programming Assignment 1
Instructor : V. Arun


Assigned : Feb 16, 2011
Due : Mar 9, 2011

1. Overview

This assignment is designed to introduce you to socket programming. The assignment asks you to implement a program to download files using a traditional client/server approach (like HTTP) and compare it to a peer-to-peer approach (like BitTorrent).

Your goal is to write a network program, referred to as the client, that downloads an image file from a server that we maintain. The client must implement two options:
  1. Client/server: Request the entire file from the server, similar to HTTP.
  2. Peer-to-peer: Ask the server for the addresses of other peers that possess parts of the file, called blocks, and download these blocks from different peers.
Your client needs to download the image as fast as possible. Note that the first option will not yield the fastest download: the server (as well as each individual peer) services requests at a constrained rate, so relying on only one server (or peer) for the entire file is likely to yield a poor download rate.

The rest of this document specifies (1) the protocol for the client/server option, (2) the protocol for the peer-to-peer option, and (3) design constraints, hints and suggestions, submission instructions, and the grading scheme.



2. Client/server option

In the client/server option, the client requests the entire file from the server over TCP, similar to HTTP. The server is running on the following <IP address, port number> combinations: <128.119.41.52, 18765> and <128.119.245.8, 18765>. You may use either one of the two servers. We support two servers just for fault-tolerance (and to defend against unintentional DoS attacks when everyone decides to stress-test their client the weekend before the assignment is due).

File request format: The client must send a request in the following format to the server to download filename, where the ‘\n’ at the end is the newline character.
        GET filename\n

The filename to use for this assignment is Redsox.jpg. Thus, the client must send the following string to request the file.
        GET Redsox.jpg\n
 
File response format: The response to the file request has the following format: 
       <block_offset(4 bytes)><block_size(4 bytes)><Data>

That is, the first four bytes contain the offset of the first byte in the block, the next four bytes contain the block size, followed by the data in the block. In the client/server option, the entire file is sent as one block, so the first field will be 0, the second field will be the size of the file, followed by data of length specified by the second field.
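
The request and response described above can be sketched in Java as follows. This is a minimal sketch, not a reference solution. It assumes that the two 4-byte header fields are big-endian (network byte order) integers, which this document does not state; if the provided Utility.java decodes integers differently, follow Utility.java. The class and method names (ClientServerDownload, readBlock, download) are hypothetical.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;

// Hypothetical helper class; not part of the provided Utility.java.
final class ClientServerDownload {

    /** One parsed response: offset of the first byte plus the payload. */
    static final class Block {
        final int offset;
        final byte[] data;

        Block(int offset, byte[] data) {
            this.offset = offset;
            this.data = data;
        }
    }

    /** Reads <block_offset><block_size><Data> from the stream. */
    static Block readBlock(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);
        int offset = din.readInt(); // ASSUMPTION: big-endian 4-byte integer
        int size = din.readInt();   // ASSUMPTION: big-endian 4-byte integer
        if (offset < 0 || size < 0) {
            throw new IOException("implausible header: " + offset + ", " + size);
        }
        byte[] data = new byte[size];
        // TCP may deliver the payload in pieces; readFully keeps reading
        // until exactly 'size' bytes have arrived (or throws on EOF).
        din.readFully(data);
        return new Block(offset, data);
    }

    /** Sends "GET filename\n" and reads the single block that comes back. */
    static Block download(String host, int port, String filename) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            DataOutputStream out = new DataOutputStream(socket.getOutputStream());
            out.writeBytes("GET " + filename + "\n");
            out.flush();
            return readBlock(socket.getInputStream());
        }
    }
}
```

A call such as ClientServerDownload.download("128.119.41.52", 18765, "Redsox.jpg") would then be expected to return a block with offset 0 whose data is the whole file.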



3. Peer-to-peer option


In the peer-to-peer option, the client must first obtain the “torrent metadata” from the tracker. The torrent metadata contains information about the number and size of blocks constituting the file and peers (<IP,port> combinations) from which the blocks may be downloaded.


3.1. Torrent metadata


Torrent metadata request format: The client must download the torrent metadata for filename by sending a UDP message in the following format:
        GET filename.torrent

Thus, to request the torrent metadata for Redsox.jpg, the client must send a UDP message containing the string “GET Redsox.jpg.torrent” (with no newline).

This UDP-based torrent metadata server is running at <128.119.41.52, 19876> and <128.119.245.8, 19876>. As above, you may use either one of the two.

Torrent metadata response format: The response to the request for torrent metadata is in the following format:
        <number_of_blocks(4 bytes)><size_of_the_entire_file(4 bytes)><IP1(4 bytes)><port1(4 bytes)><IP2(4 bytes)><port2(4 bytes)>

The names of the fields above are self-explanatory. number_of_blocks is the number of blocks in the requested file. size_of_the_entire_file is the size of the entire file in bytes. All blocks except the last block are of the same size, namely, floor(size_of_the_entire_file / number_of_blocks). <IP1> and <port1> identify the IP address and port number of the first peer. Similarly, <IP2> and <port2> identify the second peer.

Each response will contain two randomly chosen peer identifiers. You can query the tracker multiple times to get more peer identifiers. However, the tracker is designed to rate-limit the queries, so you may not get responses promptly if you send requests too fast or may not get responses at all as UDP messages can get lost.
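
One way to query the tracker and cope with lost or rate-limited replies is sketched below. It is a minimal sketch under stated assumptions: the integer fields are taken to be big-endian, the four IP bytes are taken to be in the usual order (first byte is the first octet), and the retry count and timeout are arbitrary parameters chosen by the caller. The names TrackerQuery, Metadata, parse, and query are hypothetical.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; not part of the provided Utility.java.
final class TrackerQuery {

    static final class Metadata {
        final int numBlocks;
        final int fileSize;
        final InetSocketAddress[] peers;

        Metadata(int numBlocks, int fileSize, InetSocketAddress[] peers) {
            this.numBlocks = numBlocks;
            this.fileSize = fileSize;
            this.peers = peers;
        }
    }

    /** Parses the six 4-byte fields of a torrent metadata response. */
    static Metadata parse(byte[] buf, int length) {
        if (length < 24) {
            throw new IllegalArgumentException("short response: " + length + " bytes");
        }
        ByteBuffer bb = ByteBuffer.wrap(buf, 0, length); // ASSUMPTION: big-endian
        int numBlocks = bb.getInt();
        int fileSize = bb.getInt();
        InetSocketAddress[] peers = new InetSocketAddress[2];
        for (int i = 0; i < peers.length; i++) {
            byte[] ip = new byte[4];
            bb.get(ip);
            int port = bb.getInt();
            try {
                peers[i] = new InetSocketAddress(InetAddress.getByAddress(ip), port);
            } catch (UnknownHostException e) {
                throw new IllegalArgumentException(e); // only for a wrong address length
            }
        }
        return new Metadata(numBlocks, fileSize, peers);
    }

    /** Sends "GET filename.torrent" over UDP, retrying when no reply arrives. */
    static Metadata query(InetSocketAddress tracker, String filename,
                          int attempts, int timeoutMillis) throws IOException {
        byte[] request = ("GET " + filename + ".torrent").getBytes(StandardCharsets.US_ASCII);
        try (DatagramSocket socket = new DatagramSocket()) {
            for (int i = 0; i < attempts; i++) {
                socket.setSoTimeout(timeoutMillis);
                socket.send(new DatagramPacket(request, request.length, tracker));
                DatagramPacket reply = new DatagramPacket(new byte[64], 64);
                try {
                    socket.receive(reply);
                    return parse(reply.getData(), reply.getLength());
                } catch (SocketTimeoutException e) {
                    // The request or reply was lost, or the tracker is
                    // rate-limiting us: back off before the next attempt.
                    timeoutMillis *= 2;
                }
            }
        }
        throw new IOException("no tracker response after " + attempts + " attempts");
    }
}
```

Doubling the timeout after each miss is one simple way to avoid hammering a rate-limited tracker; other back-off policies would work as well.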


3.2. Data blocks

Data blocks must be requested using TCP as follows.

Request format: The following request fetches a specific block:
        GET filename <block-number>\n

where block-number is an integer identifying the block in filename. For example, you may request block 24 in Redsox.jpg by sending the string “GET Redsox.jpg 24\n” to any one of the peers received in the torrent metadata above. Note that the servers listed in the client/server option also act as peers and support the above request format to request specific blocks.

Specifying ‘*’ instead of a block number returns a randomly chosen block:
        GET filename *\n

Response format: The response to a block request has the following format:
        <block_offset(4 bytes)><block_size(4 bytes)><Data>

As before, <block_offset> identifies the position of the first byte in the block, <block_size> is its length, followed by the actual <Data> of that length. All blocks except the last block are of the same size.
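
Because every response carries its own offset, received blocks can be copied straight into a buffer for the whole file, whatever order they arrive in. The sketch below illustrates this. It assumes the caller already knows the file size, the number of blocks, and the size of a non-final block (for example from the torrent metadata), and that blocks are numbered from 1 while offsets start at 0. The class name BlockAssembler and its methods are hypothetical.

```java
// Hypothetical helper class; not part of the provided Utility.java.
final class BlockAssembler {
    private final byte[] file;
    private final boolean[] have; // have[i] is true once block i + 1 is stored
    private final int blockSize;
    private int missing;

    BlockAssembler(int fileSize, int numBlocks, int blockSize) {
        this.file = new byte[fileSize];
        this.have = new boolean[numBlocks];
        this.blockSize = blockSize;
        this.missing = numBlocks;
    }

    /** Builds the request string for one block, e.g. "GET Redsox.jpg 24\n". */
    static String request(String filename, int blockNumber) {
        return "GET " + filename + " " + blockNumber + "\n";
    }

    /**
     * Copies a received block into place. Returns false if that block was
     * already stored (for example after a '*' request returned a repeat).
     * Synchronized so that several download threads may call it safely.
     */
    synchronized boolean store(int offset, byte[] data) {
        if (offset < 0 || offset % blockSize != 0
                || (long) offset + data.length > file.length) {
            throw new IllegalArgumentException("bad block at offset " + offset);
        }
        int index = offset / blockSize; // zero-based; block number is index + 1
        if (index >= have.length) {
            throw new IllegalArgumentException("offset beyond last block: " + offset);
        }
        if (have[index]) {
            return false;
        }
        System.arraycopy(data, 0, file, offset, data.length);
        have[index] = true;
        missing--;
        return true;
    }

    synchronized boolean isComplete() {
        return missing == 0;
    }

    /** Returns a copy of the assembled file contents. */
    synchronized byte[] bytes() {
        return file.clone();
    }
}
```

Once isComplete() returns true, the array from bytes() can be handed to the byte-array-to-JPEG method in Utility.java.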



4. Design constraints


Your objective is to download the file as fast as possible. However, you must do so while respecting the following constraints. Not respecting these constraints will be considered cheating.
  1. You can maintain only one open connection to each port discovered. You can download multiple blocks over this connection. You can also close this connection and open a new one to the same port.
  2. You must not try to probe for random ports (for example, by searching iteratively through the port address space) to discover more ports.
  3. You must not remember ports from previous runs and use them by hard-coding them in your program.

4.1. Hints and suggestions
  1. We encourage you to use multiple threads to simultaneously download blocks from different peers. Remember to carefully synchronize access to shared data structures when using threads. If you have not taken an Operating Systems course or do not otherwise have experience using threads, this part will involve a steep learning curve.
  2. A separate utility class, Utility.java, has been provided so that you can focus on the networking aspects of the project. This class has convenient methods for: (i) converting a byte array to an integer; (ii) converting an integer to a byte array; (iii) converting a byte array to a JPEG image file.
  3. Plan to complete the assignment well before the due date. A successful run of the entire program to download the file takes roughly 30 minutes, so debugging and testing may take longer than expected.
  4. Comment your code as much as possible. Even if there are minor errors, the comments will convey your logic and hence help you get more points.
  5. Remember to substitute filename in the above commands with Redsox.jpg. An easy way to check if you got all the bytes in the file correctly is to view the downloaded image.
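
The multi-threading hint can be sketched as one worker thread per distinct peer, all pulling block numbers from a shared thread-safe queue. This is only one possible structure, shown under assumptions: the Fetcher interface is a hypothetical stand-in for code that requests a block over that peer's single TCP connection, and duplicate peers are dropped so that no peer gets two workers.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; P is whatever type identifies a peer.
final class ParallelDownload {

    /** Hypothetical callback that fetches one block from one peer. */
    interface Fetcher<P> {
        byte[] fetch(P peer, int blockNumber) throws Exception;
    }

    /**
     * Downloads blocks 1..numBlocks using one worker per distinct peer.
     * Returns the blocks fetched so far; the caller should check that the
     * map holds numBlocks entries and retry any that are missing.
     */
    static <P> Map<Integer, byte[]> run(List<P> peers, int numBlocks, Fetcher<P> fetcher)
            throws InterruptedException {
        Set<P> distinct = new LinkedHashSet<>(peers); // one worker per peer
        if (distinct.isEmpty()) {
            throw new IllegalArgumentException("no peers");
        }
        Queue<Integer> pending = new ConcurrentLinkedQueue<>();
        for (int b = 1; b <= numBlocks; b++) {
            pending.add(b); // blocks are numbered starting from 1
        }
        Map<Integer, byte[]> done = new ConcurrentHashMap<>();

        ExecutorService pool = Executors.newFixedThreadPool(distinct.size());
        for (P peer : distinct) {
            pool.execute(() -> {
                Integer block;
                while ((block = pending.poll()) != null) {
                    try {
                        done.put(block, fetcher.fetch(peer, block));
                    } catch (Exception e) {
                        pending.add(block); // hand the block back to the others
                        return;             // and retire this peer's worker
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        return done;
    }
}
```

The shared queue and map are concurrent collections, so the workers need no explicit locks. A block handed back by a failing worker may remain unfetched if the other workers have already finished, which is why the caller must check the result.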

4.2. Things to keep in mind

  1. All data transfer uses TCP, but the torrent metadata uses UDP.
  2. Use correct <IP address, port> combinations.
  3. You may use DataOutputStream and writeBytes for sending out the requests, and DataInputStream for receiving responses, similar to the in-class example. You are encouraged to explore and use other commands supported by the Java socket API.
  4. You may use methods in the InetAddress class to parse the IP address from the torrent response.
  5. Blocks are numbered starting from 1. Block offsets (the position of the first byte of the block in the file) start from 0.
  6. Each block is of size 10000 bytes. You will also see this in the torrent metadata response message. The total size of the file Redsox.jpg is 805975 bytes. There is also a smaller file test.jpg of 58241 bytes for quickly testing your code. You can use all the supported commands to download either file.
  7. Make sure your client is resilient to long-lived connections and delays in data reception.
  8. You can use ‘telnet’ as we saw in class to check that the servers actually work. For example, you can do telnet <IP> <port> and send commands such as GET Redsox.jpg\n or GET Redsox.jpg 3\n and the server will return the corresponding data (in binary) on the command line.

4.3. Submission instructions

  1. You must submit your client program. Name the main file TorrentClient.<appropriate-extension>. Feel free to add helper classes alongside this main file.
  2. You can use any programming language of your choice. The nice part about network programming using the socket API is that the server and client may be written in different programming languages.
  3. We suggest the following command line format for running the client in the client/server or P2P mode. For example, if you are using Java, use
            java TorrentClient CS <IP> <port>
            java TorrentClient P2P <IP> <port>
    where “CS” means that the client will use the client/server option and “P2P” means that the client will use the P2P option. The IP and port are the IP and port of the (TCP-based) file server in the former case and the UDP-based torrent metadata server in the latter.
  4. Write a design document outlining your algorithm and the test cases used to test the client. Name this document <firstname.lastname>-PA1-design.<appropriate-extn>. For example, John Smith submitting the design doc as a PDF will submit it as john.smith-PA1-design.pdf.
  5. Attach the output of a run of your program. The output may be a screenshot or the text output from your console. The output MUST include the total time taken to download the file.
  6. Name this file <last-name>-pa1-output.<appropriate-extension>. For example, John Smith submitting a screenshot as a JPEG will submit it as john.smith-pa1-output.jpg.
  7. Zip all the above mentioned items into a *SINGLE* zip file and submit. The zip file must be of the form <last-name>-pa1.<extension>
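
The suggested command line and the required download-time output could be handled along the following lines. This is a sketch only: the class name ClientArgs, its methods, and the exact wording of the timing message are hypothetical, and the actual download logic is left out.

```java
import java.util.Locale;

// Hypothetical helper for parsing "<CS|P2P> <IP> <port>" and reporting time.
final class ClientArgs {
    enum Mode { CS, P2P }

    final Mode mode;
    final String host;
    final int port;

    private ClientArgs(Mode mode, String host, int port) {
        this.mode = mode;
        this.host = host;
        this.port = port;
    }

    /** Parses the three arguments; throws IllegalArgumentException if invalid. */
    static ClientArgs parse(String[] args) {
        if (args.length != 3) {
            throw new IllegalArgumentException(
                "usage: java TorrentClient <CS|P2P> <IP> <port>");
        }
        // valueOf rejects anything other than CS or P2P.
        Mode mode = Mode.valueOf(args[0].toUpperCase(Locale.ROOT));
        int port = Integer.parseInt(args[2]);
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return new ClientArgs(mode, args[1], port);
    }

    /** Formats elapsed time between two System.nanoTime() readings. */
    static String formatElapsed(long startNanos, long endNanos) {
        return String.format(Locale.ROOT, "Total download time: %.3f s",
                (endNanos - startNanos) / 1e9);
    }
}
```

A main method would typically record System.nanoTime() just before the first request and again after the last byte arrives, then print the formatted result so that it appears in the submitted output.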

4.4. Grading Scheme
  1. Design document (25 points): The design document should clearly explain your strategy and why you expect the strategy to work well. The document should also explain the test cases that you used to verify that your code works correctly and downloads as fast as expected under different network conditions. If you made any assumptions about the server that are not specified above, you should explicitly state these assumptions.
  2. Code implementation and documentation (35 points). 
  3. Correct execution (40 points). We will run your code and view your downloaded file to check this.