Binary file download in C# .Net with speed limiter

Downloading files over HTTP is easy in C#/.Net – in fact, it’s as easy as writing a single line of code:

new WebClient().DownloadFile("http://your.url", "saveAs_Filename.jpg");

However, there’s no .Net library function that allows you to limit the download speed. Here’s how you can write your own.

What you’ll need is at its core a simple TCP client. It will connect to an HTTP server, send a GET request, then receive the reply from the server until the server stops sending. Because we handle the received data ourselves, we can add speed measurement and speed limiting.

If you don’t care about the details, scroll down to the end of the article for the full demo source file.

Reading the raw reply stream

// Requires: using System; using System.IO; using System.Net.Sockets;
static void DownloadBinaryFile(string hostname, string url, string filename)
{
    // File.Create truncates an existing file; File.OpenWrite would leave stale bytes at its end
    var writer = new BinaryWriter(File.Create(filename));
    var request = "GET " + url + " HTTP/1.1\r\n" + "Host: " + hostname + "\r\n\r\n";
    TcpClient client = new TcpClient(hostname, 80);
    Byte[] data = System.Text.Encoding.ASCII.GetBytes(request);
    NetworkStream stream = client.GetStream();
    stream.Write(data, 0, data.Length);
    bool completed = false;
    Int32 packetLength = 0;

    do
    {
        data = new Byte[256];
        // Read returns 0 once the server has closed the connection
        packetLength = stream.Read(data, 0, data.Length);
        Console.WriteLine(
            System.Text.Encoding.ASCII.GetString(data, 0, packetLength));
        writer.Write(data, 0, packetLength);
        if (packetLength == 0) completed = true;
    } while (completed == false);

    writer.Flush();
    writer.Close();
    stream.Close();
    client.Close();
}

...
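// The hostname must be the server that actually hosts the file, and the url
// is the path on that server (the function talks plain HTTP on port 80)
DownloadBinaryFile("upload.wikimedia.org",
    "/wikipedia/en/b/bc/Wiki.png", "test.png");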

This will print the received packets to the console and write them to a file.

Try to open the file and you will probably get an error message that the file is invalid. This is because the complete server reply was written to the file. It will look like this:

HTTP/1.1 200 OK
...
Content-Type: image/jpeg
...
Content-Length: 6407935
Connection: Keep-Alive

[Binary fragments...]

The server reply starts with the HTTP header, which contains various bits of information, some of which will be useful later. The header is followed by an empty line: the byte sequence “\r\n\r\n”, that is, two consecutive linebreaks (the first “\r\n” ends the last header line, the second one is the empty line).

Everything after the empty line to the end of the server reply is the binary data of the file. Be aware that this binary data might itself contain another instance of the “\r\n\r\n” byte sequence. Any such later occurrence is part of the binary data, not a control sequence, so only the first occurrence marks the end of the header.
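To make the header layout a bit more tangible, here is a minimal sketch that splits the clear-text header into the status line and a name/value dictionary. The class and method names (HttpHeaderSketch, ParseHeaders) are made up for this illustration and are not part of the demo program; the sketch assumes the header text has already been separated from the binary data.

```csharp
using System;
using System.Collections.Generic;

static class HttpHeaderSketch
{
    // Splits the clear-text header block into the status line and a
    // dictionary of header fields. Field names are compared without regard
    // to case, because HTTP header names are case-insensitive.
    public static Dictionary<string, string> ParseHeaders(string headerText, out string statusLine)
    {
        var lines = headerText.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
        statusLine = lines.Length > 0 ? lines[0] : "";

        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        for (int i = 1; i < lines.Length; i++)
        {
            int colon = lines[i].IndexOf(':');
            if (colon <= 0) continue; // not a "Name: value" line, skip it
            string name = lines[i].Substring(0, colon).Trim();
            string value = lines[i].Substring(colon + 1).Trim();
            headers[name] = value; // a repeated field simply overwrites the earlier one
        }
        return headers;
    }
}
```

With the sample reply above, headers["Content-Length"] would hold the string "6407935". Overwriting repeated fields is a simplification that is good enough for reading a single value.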

Filtering out the HTTP header

The first step is to filter out the header and only write the binary part of the reply to the target file. The data arrives in packets (strictly speaking, in whatever chunks Read() returns, which need not match the network packets) that do not distinguish between header and binary data, so the flow of data will look like this:

– The first packet or packets will contain the HTTP header as clear text.
– The end of the clear text part will be marked by the \r\n\r\n linebreak sequence.
– Therefore, a packet will either contain only header information, a combination of header / linebreak / binary data, or just binary data.
– Any packet received after the linebreak packet will only contain binary data.

Any packet that contains only header info can be dropped. Any packet that contains only binary data will be written to file. The interesting part is the packet that contains the linebreak sequence. You will have to find the position of the linebreak sequence and only write the part to file that comes after it.

Also, the linebreak sequence may be split across two packets, so that no single packet contains the full sequence. To handle this case safely, you can concatenate the previous packets and the current packet, then search the result for the sequence. The following code concatenates all header packets, since we’ll be needing them later. Once we’ve found the linebreak packet, concatenating is no longer necessary.

static int GetLinebreakPosition(Byte[] source)
{
    var rnrn = Encoding.ASCII.GetBytes("\r\n\r\n");
    for (int i = 0; i < source.Length - 3; i++)
    {
        if (source[i] == rnrn[0] && source[i + 1] == rnrn[1] &&
            source[i + 2] == rnrn[2] && source[i + 3] == rnrn[3])
        {
            return i + 4; // index of the first byte after the sequence
        }
    }
    return -1; // sequence not found
}

static Byte[] TruncateData(Byte[] source, int start, int end)
{
    List<Byte> tempresult = new List<Byte>();
    for (int i = start; i < end; i++) { tempresult.Add(source[i]); }
    return tempresult.ToArray();
}

static Byte[] ConcatArrays(Byte[] source1, Byte[] source2)
{
    List<Byte> tempresult = new List<Byte>();
    foreach (var b in source1) { tempresult.Add(b); }
    foreach (var b in source2) { tempresult.Add(b); }
    return tempresult.ToArray();
}

static void DownloadBinaryFile(string hostname, string url, string filename)
{
    var writer = new BinaryWriter(File.OpenWrite(filename));
    var request = "GET " + url + " HTTP/1.1\r\n" + "Host: " + hostname + "\r\n\r\n";
    TcpClient client = new TcpClient(hostname, 80);
    Byte[] data = System.Text.Encoding.ASCII.GetBytes(request);
    NetworkStream stream = client.GetStream();
    stream.Write(data, 0, data.Length);
    bool completed = false;
    Byte[] headerdata = new Byte[0];
    bool isInHeader = true;

    do
    {
        data = new Byte[256];
        var packetLength = stream.Read(data, 0, data.Length);
        // Check for the end of the stream right away: packetLength is overwritten
        // below and may become 0 when the header ends exactly at the end of a packet
        if (packetLength == 0) completed = true;

        if (isInHeader == true)
        {
            headerdata = ConcatArrays(headerdata, TruncateData(data, 0, packetLength));
            var linebreakPosition = GetLinebreakPosition(headerdata);

            if (linebreakPosition != -1)
            { // this packet contains the linebreak!
                isInHeader = false;
                //Write the temporary data back to the buffer,
                //but only the part after the linebreak
                data = TruncateData(headerdata, linebreakPosition, headerdata.Length);
                packetLength = headerdata.Length - linebreakPosition;
            }
        }

        if (isInHeader == false) writer.Write(data, 0, packetLength);
    } while (completed == false);

    writer.Flush();
    writer.Close();
    stream.Close();
    client.Close();
}

This should result in a valid file that only contains the downloaded binary data.
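If you want to try the header filtering without a network connection, the same idea can be sketched as a small class that is fed one packet at a time. HeaderSplitter is a hypothetical helper written for this illustration, not part of the demo program. It buffers the header in a MemoryStream instead of concatenating arrays, which is a swapped-in technique with the same effect.

```csharp
using System;
using System.IO;

// Hypothetical helper: buffers incoming data until the first "\r\n\r\n"
// has been seen, then forwards everything after it to the body stream.
sealed class HeaderSplitter
{
    private readonly MemoryStream header = new MemoryStream();
    private readonly Stream body;

    public bool InHeader { get; private set; } = true;

    // The header bytes collected so far (including the final "\r\n\r\n" once found)
    public byte[] HeaderBytes => header.ToArray();

    public HeaderSplitter(Stream bodyTarget) { body = bodyTarget; }

    public void Feed(byte[] packet, int count)
    {
        if (!InHeader)
        {
            body.Write(packet, 0, count); // pure binary data
            return;
        }

        // Append to everything received so far, so that a sequence split
        // across two packets is still found.
        header.Write(packet, 0, count);
        byte[] buffer = header.GetBuffer();
        int length = (int)header.Length;

        for (int i = 0; i + 3 < length; i++)
        {
            if (buffer[i] == 13 && buffer[i + 1] == 10 &&
                buffer[i + 2] == 13 && buffer[i + 3] == 10)
            {
                int bodyStart = i + 4;
                body.Write(buffer, bodyStart, length - bodyStart);
                header.SetLength(bodyStart); // keep only the header part
                InHeader = false;
                return;
            }
        }
    }
}
```

Feeding it a packet that ends in "\r\n\r" followed by a packet that starts with "\n" exercises exactly the split case described above. The sketch rescans the whole header on every call, which is fine for headers of a few hundred bytes but would be wasteful for very large ones.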

Handling the “Content-Length”

You will notice that the download function hangs at the end. This is because the download only ends when Read() returns zero bytes, which happens when the server closes the connection. With HTTP/1.1, the server keeps the connection open after the last packet (note the “Connection: Keep-Alive” line in the header above), so the client keeps listening for more data. Only when the server’s idle timeout is reached does it close the connection, which ends the download somewhat gracefully.

The elegant way is to stop listening for packets once the content has been received and avoid the timeout. The “Content-Length” field of the HTTP header contains the total number of bytes of the binary part.

Here’s an implementation that uses a regular expression to extract the content length from the concatenated header packets that were collected before.

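// Requires: using System.Text.RegularExpressions;
static int GetContentLength(Byte[] headerdata)
{
    try
    {
        // Header field names are case-insensitive and the space after the colon is optional
        Regex ItemRegex = new Regex("Content-Length:\\s*(\\d+)", RegexOptions.IgnoreCase);
        Match match = ItemRegex.Match(System.Text.Encoding.ASCII.GetString(headerdata));
        if (match.Success)
        {
            return Int32.Parse(match.Groups[1].Value);
        }
    }
    catch { } // e.g. a value too large for Int32: fall through and report "unknown"
    return 0; // 0 means the content length is unknown
}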

Once we’ve detected the end of the header, we can retrieve the length of the content. Then all we have to do is to count the bytes written to disk. If reading the content length fails, we can still use the old method and end the download by waiting for the timeout.

...
int bytesDownloaded = 0;
int contentlength = 0;

do
{
    data = new Byte[256];
    var packetLength = stream.Read(data, 0, data.Length);
    // Check for the end of the stream right away: packetLength is overwritten
    // below and may become 0 when the header ends exactly at the end of a packet
    if (packetLength == 0) completed = true;

    if (isInHeader == true)
    {
        headerdata = ConcatArrays(headerdata, TruncateData(data, 0, packetLength));
        var linebreakPosition = GetLinebreakPosition(headerdata);
        if (linebreakPosition != -1)
        { // this packet contains the linebreak!
            isInHeader = false;
            //Write the temporary data back to the buffer,
            //but only the part after the linebreak
            data = TruncateData(headerdata, linebreakPosition, headerdata.Length);
            packetLength = headerdata.Length - linebreakPosition;
            contentlength = GetContentLength(headerdata);
        }
    }

    if (isInHeader == false)
    {
        bytesDownloaded += packetLength;
        writer.Write(data, 0, packetLength);
    }
    // >= instead of == so that an unexpected extra byte cannot cause an endless wait
    if (contentlength != 0 && bytesDownloaded >= contentlength) completed = true;
} while (completed == false);
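A complementary option, which the listing above does not use, is to ask the server to close the connection as soon as the reply is complete by sending a "Connection: close" header with the request. On servers that honor it, Read() returns zero right after the last byte, even if the reply carries no Content-Length. The following is only a sketch; RequestSketch and BuildGetRequest are names made up for this example.

```csharp
using System.Text;

static class RequestSketch
{
    // Builds the raw bytes of a GET request that asks the server to close
    // the connection after the reply instead of keeping it alive.
    public static byte[] BuildGetRequest(string hostname, string path)
    {
        string request =
            "GET " + path + " HTTP/1.1\r\n" +
            "Host: " + hostname + "\r\n" +
            "Connection: close\r\n" +
            "\r\n"; // the empty line ends the request header
        return Encoding.ASCII.GetBytes(request);
    }
}
```

The returned array would take the place of the data array that is written to the NetworkStream. Counting bytes against the content length is still worthwhile, since it also provides the progress information.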

Measuring and limiting the download speed

Measuring the download speed is as simple as counting the number of bytes that are downloaded in a specific timeframe. If the number exceeds the speed limit threshold, issue a Sleep() command to stop downloading for the rest of the timeframe.

A word about network speed limit granularity. There is no such thing as a byte that is transmitted with 10% or 90% speed. Any packet you send or receive will use 100% of your network capacity at that moment. Therefore, the choice of length of the reference timeframe is important.

A shorter timeframe will provide less measurement resolution – that is, to have a resolution of 10%, you will have to measure a timespan that would allow a minimum of 10 packets to pass through, and for a resolution of 0.1%, this timespan needs to be at least 1000 packets long.

However, a greater length of the measurement timeframe also means less resolution for the speed limiter. Let’s say your timeframe is 10 seconds long and you want to limit the transmission speed to 50%. This means that your client would be downloading full speed for 5 seconds and hog the entire bandwidth, then shut off for another 5 seconds. This behaviour would be highly undesirable.

Additionally, the measurement resolution is relative to the total bandwidth. Let’s say you have a very slow connection that can only transmit a packet per second. To achieve a resolution of 0.1%, your timeframe would have to be 1000 seconds long. You could try to reduce the packet buffer size down to the point where you handle single bytes, but this would result in an enormous processing overhead.
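The arithmetic behind these numbers can be written down in a few lines. This is just a sketch of the reasoning above: a resolution of p percent needs 100 / p packets per timeframe, and the timeframe has to be long enough for that many packets to arrive. ResolutionSketch and its method names are invented for the example.

```csharp
using System;

static class ResolutionSketch
{
    // Number of packets a timeframe must span so that a single packet
    // corresponds to the given resolution, e.g. 10 (%) -> 10 packets.
    public static int PacketsForResolution(double resolutionPercent)
    {
        return (int)Math.Round(100.0 / resolutionPercent);
    }

    // Minimum timeframe length in seconds for a connection that delivers
    // the given number of packets per second.
    public static double MinimumTimeframeSeconds(double resolutionPercent, double packetsPerSecond)
    {
        return PacketsForResolution(resolutionPercent) / packetsPerSecond;
    }
}
```

For the slow connection from the example, MinimumTimeframeSeconds(0.1, 1.0) gives the 1000 seconds mentioned above, while a connection that delivers 1000 packets per second reaches the same resolution within one second.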

Long story short: there’s no silver bullet to precisely limit network speed. Choose a timeframe that works for you. In this implementation, I chose 100 milliseconds.

...
// Requires: using System.Diagnostics; using System.Threading;
// bytesPer100msecsLimit is the speed limit in bytes per 100 ms, declared elsewhere
int bytesDownloaded = 0;
var contentlength = 0;
var sw = new Stopwatch();
sw.Start();
int bytesPer100msecs = 0;

do
{
    data = new Byte[256];
    var packetLength = stream.Read(data, 0, data.Length);
    // Check for the end of the stream right away: packetLength is overwritten
    // below and may become 0 when the header ends exactly at the end of a packet
    if (packetLength == 0) completed = true;

    if (isInHeader == true)
    {
        headerdata = ConcatArrays(headerdata, TruncateData(data, 0, packetLength));
        var linebreakPosition = GetLinebreakPosition(headerdata);

        if (linebreakPosition != -1)
        { // this packet contains the linebreak!
            isInHeader = false;
            //Write the temporary data back to the buffer,
            //but only the part after the linebreak
            data = TruncateData(headerdata, linebreakPosition, headerdata.Length);
            packetLength = headerdata.Length - linebreakPosition;
            contentlength = GetContentLength(headerdata);
        }
    }

    if (isInHeader == false)
    {
        bytesDownloaded += packetLength;
        bytesPer100msecs += packetLength;

        var elapsedMilliseconds = sw.ElapsedMilliseconds;
        if (elapsedMilliseconds >= 100)
        {
            var progressPercentage = ((double)bytesDownloaded / (double)contentlength) * 100;
            // bytes per 100 ms divided by 100 = bytes per ms = kilobytes per second
            Console.WriteLine("Progress: " + progressPercentage +
                "%, Speed: " + (bytesPer100msecs / 100.0) + " kB/s");

            bytesPer100msecs = 0; //reset the byte counter
            sw.Restart(); //reset the timeframe
        }

        if (bytesPer100msecs > bytesPer100msecsLimit && elapsedMilliseconds < 100)
        {
            //make sure your sleep timespan is zero or positive.
            Thread.Sleep(TimeSpan.FromMilliseconds(100 - elapsedMilliseconds));
        }
        writer.Write(data, 0, packetLength);
    }
    if (contentlength != 0 && bytesDownloaded >= contentlength) completed = true;
} while (completed == false);

Running this in a console application will look something like this. You can see that the speed is jumping between 12 and 15 kB/s because the measurement resolution is low.
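The limiter decision itself is easy to isolate from the download loop, which makes it simple to check with a few numbers. The following sketch is not taken from the demo program: ThrottleSketch and its members are hypothetical names, and it assumes that one kB means 1000 bytes and that the timeframe is the 100 milliseconds chosen above.

```csharp
static class ThrottleSketch
{
    public const int TimeframeMs = 100; // the timeframe used in this article

    // Converts a limit in kB/s (assuming 1 kB = 1000 bytes) into the
    // number of bytes allowed per timeframe.
    public static long BytesPerTimeframe(double kiloBytesPerSecond)
    {
        return (long)(kiloBytesPerSecond * 1000 * TimeframeMs / 1000);
    }

    // Returns how many milliseconds to sleep: the rest of the current
    // timeframe if the byte budget is exceeded, otherwise 0.
    public static int SleepMilliseconds(long bytesInTimeframe, long bytesPerTimeframeLimit, long elapsedMs)
    {
        if (bytesInTimeframe <= bytesPerTimeframeLimit) return 0; // still within budget
        if (elapsedMs >= TimeframeMs) return 0;                   // timeframe already over
        return (int)(TimeframeMs - elapsedMs);
    }
}
```

A limit of 15 kB/s corresponds to 1500 bytes per timeframe. If 2000 bytes have arrived after 40 milliseconds, the function asks for a pause of 60 milliseconds, which is the same rule the loop above applies inline.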

Download the full sourcecode

Download the full listing of the demo program. This is intended as a proof-of-concept only and lacks error handling. It will retrieve this picture of a capybara from Wikimedia and save it to disk:
