Welcome

As I continue my programming adventures in Rust, I’ve decided to launch yet another learning project. This time, it’s going to be an implementation of a simple BitTorrent client.

Motivations

Why BitTorrent? One reason is that I’ve always been interested in how peer-to-peer systems work. There’s something very intriguing about how multiple actors can collaborate to accomplish a task without the need for centralized control. By diving deeper into the implementation of such a system, I hope to gain a better understanding of how they work in general.

Another reason is that I expect a project like this will help me deepen my experience with Rust. In particular, I would like to become more familiar with these areas that I haven’t explored yet:

  • Network programming
  • Multi-threaded programming
  • Programming console UIs

Project scope

Writing a fully-fledged BitTorrent client is quite a big task, so for my pet project I’d like to scale it down to the essentials. I will consider the project accomplished when my solution is able to do the following:

  • Connect to the torrent tracker to fetch the initial information about the file to download
  • Download the file from multiple peers in parallel
  • Serve requests from other peers while the download is ongoing
  • Show the download progress in some form of text-based UI

It should be noted that I’m starting this project knowing absolutely nothing about the BitTorrent protocol. Of course, I have some experience with various BitTorrent clients as a user, but I have absolutely no idea how they work under the hood. But hey, that’s what this project is all about: getting into the nitty-gritty details!

Obviously, I’m not the first person to implement a BitTorrent client. There are plenty of resources on the Web dedicated to this subject. Here, I’m going to put up a list of those that I use in the course of this project. The list will be updated as I move along.

Project diary

  • BitTorrent: Key concepts

    Let’s start with an overview of the key concepts of BitTorrent architecture to gain a high-level understanding of what we’re dealing with.

  • First step: Fetching the announce URL

    As mentioned in the previous post, BitTorrent clients start the download by first making a request to the torrent tracker using the announce URL to retrieve the list of available peers. The announce URL is taken from the torrent file. It seems that my very first task should be to parse the torrent file and extract the announce URL.

  • Make HTTP request to the tracker

    We left off our project at the point where we managed to parse the torrent file (at least partially) and extract the tracker’s announce URL from it. Now, it’s time to make use of this URL and write some code to send an HTTP request to the torrent tracker.

  • Obtaining the list of peers

    In the last chunk of work, I achieved a milestone: making a request to the torrent tracker and getting back a meaningful response. The response is “torrent not found,” though, because I’m still passing a fake torrent hash in the request parameters. Today, I want to fix this situation and make the program use the real hash value that it will obtain from the torrent file.

  • Parsing the peer list from tracker

    Last time, we managed to fetch the list of peers from the torrent tracker for our sample torrent file. I left off by simply dumping the response from the torrent tracker onto the screen, and now I would like to pick up on that and actually parse the tracker’s response so that we can get our hands on peers’ IP addresses and ports. That’s going to be our next step towards making connections to peers.

  • Time for reflection

    I’ve made some progress with communicating with the torrent tracker so far, and I’m ready to dive into the details of peer-to-peer communication. However, I have some doubts about what problem to tackle next. Should I dive into the technical details of peer-to-peer communication, or should I polish the code I already have? Or maybe I should look into areas I haven’t even touched yet? I’d like to step back and look at the bigger picture.

  • Shaking hands with peers

    Our last significant achievement was getting the list of peer IP addresses and ports from the torrent tracker. This is essentially where the tracker’s job ends. From here on, all communication happens directly between peers over TCP. Our first task in this peer-to-peer exchange is to connect to peers and perform the initial handshake.

  • Downloading some data from the peer

    Last time, we left off at the point where we could exchange the initial handshake messages with some of the peers. By the end of the session, I had some doubts about what to do next: should I continue working on peer-to-peer communication, or should I elaborate on the code that we already had working? After some reflection, I realized that peer communication was more interesting to move forward with.

  • Downloading the whole piece

    Now that we’re able to download a portion of the file from the remote peer, I’m tempted to milk this cow dry. Let’s now download the entire file piece and verify its validity by checking its SHA-1 hash.

  • Downloading the entire file

    We’ve managed to download and verify a single piece. After that, extending the code to download the entire file is quite a straightforward progression. Since we know how many pieces there are, we can simply download them one by one, similar to how we were downloading a single piece in 16 KiB blocks. Only the last piece would require special care, because it can be shorter than the others.

  • Low download speed: looking into the issue

    We have reached the milestone of downloading the entire file, but the download speed is frustratingly low. In this section, I’d like to explore this issue and try out some experiments to eliminate the bottleneck. To make the experimentation more reliable, I’m going to set up a BitTorrent client locally so that our investigation is not influenced by random network delays and unpredictable remote peer settings.

  • Speeding up downloads: pipelining requests

    So, our previous experiments have shown that request pipelining does in fact improve the download speed. Now we can move forward and create a proper implementation for it.

  • Selecting a peer with the complete file

    In our current implementation, we communicate with only one peer to download the file. This imposes some additional restrictions on which peer we can select. In this section, I discuss these restrictions and their implementation in code.

  • Discovering Serde

    At the beginning of this project, I implemented a simple parser for .torrent files. It was an interesting exercise to get familiar with the bencoding format. However, there is already an implementation for parsing bencoded data that comes as an extension to a popular Rust deserialization library called Serde. I think it is a good opportunity to get familiar with this library and switch to using Serde for working with bencoded data.

  • Better logging with Tracing

    Yet another thing that has interested me for a while is how to approach logging in a Rust application. Until now, I was just relying on the println! macro to display significant events in my torrent client. However, this is a poor man’s solution: in production applications, you don’t want to rely on println! statements, and your approach to logging should be more systematic.

  • Terminal UIs: starting with Ratatui

    Having added tracing to the application, I’ve got a lot of visibility into what’s going on under the hood. However, it’s not exactly user-friendly: parsing the tracing output is no fun at all. I think it’s time to pay more attention to the application’s user interface. In particular, I’m interested in developing a terminal user interface application, inspired by many popular Linux command-line tools, such as htop.

  • Ratatui: implementing background tasks

    In the previous post, I explored the structure of a simple interactive Ratatui application. However, the BitTorrent client presents additional challenges: the driver of the application is the download process, which works outside the UI-driven render loop. In this section, I’m laying the groundwork for a terminal UI application that does most of its work in the background.

  • Ratatui: error handling in background tasks

    I was in the middle of connecting the code of the application to the UI when I suddenly realized that I had skipped a very important topic: how are we supposed to handle errors that may occur in a background task? In particular, if the download fails, how should we react to it? The UI implementation I started in the previous post simply ignored the fact that a background task can fail. That realization made me step back and reason about error handling more thoroughly.

  • Ratatui UI: Connecting the dots

    I’ve done most of the work in the previous section, so connecting the code that manages UI to the main download logic has become quite an easy change. I’ll briefly highlight the most notable changes in this section. For implementation details, check out the version on GitHub.

  • First integration test

    I’m about to start making some serious changes to the core logic of the download process. However, before diving in head-first, I would like to strengthen my test suite by introducing the first integration test, which will use a real BitTorrent client as a remote peer. Moreover, I would like to create a controlled test environment, so that test execution does not rely on anything that lives elsewhere on the internet and is therefore outside my control.

  • Non-blocking I/O: Connect to multiple peers concurrently

    In this section, I’m taking a first shot at dealing with multiple peers concurrently using non-blocking I/O. We’ll revisit the process of connecting to remote hosts and try to eliminate the biggest time sink we’ve had so far: connection timeouts.
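
To make the piece and block arithmetic from the “Downloading the entire file” entry a bit more concrete, here is a minimal sketch in Rust. It is not the code from the project: the function names (piece_count, piece_len, blocks) and the sizes used in main are hypothetical and chosen only for illustration. It assumes 16 KiB blocks and that only the last piece, and the last block within a piece, may be shorter than the rest.

```rust
/// Block size used for requests within a piece (16 KiB, as in the diary entry).
const BLOCK_SIZE: u64 = 16 * 1024;

/// Number of pieces needed to cover `total_length` bytes (rounds up).
/// Assumes `piece_length` is greater than zero.
fn piece_count(piece_length: u64, total_length: u64) -> u64 {
    (total_length + piece_length - 1) / piece_length
}

/// Length in bytes of the piece at `index`.
/// Assumes `index` is a valid piece index; only the last piece can be shorter.
fn piece_len(index: u64, piece_length: u64, total_length: u64) -> u64 {
    let start = index * piece_length;
    piece_length.min(total_length - start)
}

/// Splits a piece of `piece_len` bytes into (offset, length) block requests.
fn blocks(piece_len: u64) -> Vec<(u64, u64)> {
    (0..piece_len)
        .step_by(BLOCK_SIZE as usize)
        .map(|offset| (offset, BLOCK_SIZE.min(piece_len - offset)))
        .collect()
}

fn main() {
    // Hypothetical torrent: 256 KiB pieces, file size not a multiple of the piece size.
    let piece_length: u64 = 256 * 1024;
    let total_length: u64 = 600 * 1024 + 100;

    let count = piece_count(piece_length, total_length);
    for index in 0..count {
        let len = piece_len(index, piece_length, total_length);
        let requests = blocks(len);
        println!("piece {index}: {len} bytes in {} blocks", requests.len());
    }
}
```

Downloading the whole file then amounts to iterating over the piece indices and requesting each block in turn; the same `min` trick handles both the short last piece and the short last block.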