A few things everyone can do now:

  1. Please consider running a relay to help the Tor network grow.
  2. Do you have an Amazon account? Are you willing to spend up to $3 a month? Then spin up your own Tor bridge in less than 10 minutes with Tor Cloud!
  3. Tell your friends! Get them to run relays. Get them to run hidden services. Get them to tell their friends.
  4. If you like Tor's goals, please take a moment to donate to support further Tor development. We're also looking for more sponsors — if you know any companies, NGOs, agencies, or other organizations that want anonymity / privacy / communications security, let them know about us.
  5. We're looking for more good examples of Tor users and Tor use cases. If you use Tor for a scenario or purpose not yet described on that page, and you're comfortable sharing it with us, we'd love to hear from you.

Documentation

  1. Help translate the web page and documentation into other languages. See the translation guidelines if you want to help out. We especially need Arabic or Farsi translations, for the many Tor users in censored areas.
  2. Evaluate and document our list of programs that can be configured to use Tor.
  3. We have a huge list of potentially useful programs that interface to Tor. Which ones are useful in which situations? Please help us test them out and document your results.

Advocacy

  1. Create a presentation that can be used for various user group meetings around the world.
  2. Create a video about the positive uses of Tor, what Tor is, or how to use it. Some have already started on Tor's Media server, Howcast, and YouTube.
  3. Create a poster, or a set of posters, around a theme, such as "Tor for Freedom!".
  4. Create a t-shirt design that incorporates "Congratulations! You are using Tor!" in any language.

Projects

Below is a list of Tor-related projects we're developing and/or maintaining. Most discussions happen on IRC, so if you're interested in any of these (or you have a project idea of your own), please join us in #tor-dev. Don't be shy about asking questions, and don't hesitate to ask even if the main contributors aren't active at that moment.

Name | Category | Language | Activity | Contributors
Tor | Core | C | Heavy | nickm, arma, Sebastian
*JTor | Core | Java | None |
TBB | Usability | Sys Admin | Moderate | Erinn
Tails | Usability | Sys Admin | Heavy | #tails
Torsocks | Usability | C | None | mwenge
*Torouter | Usability | Sys Admin | Light | ioerror, Runa
Vidalia | User Interface | C++, Qt | Light | chiiph
Arm | User Interface | Python, Curses | Light | atagar
Orbot | User Interface | Java | Moderate | n8fr8
Torbutton | Browser Add-on | Javascript | Moderate | mikeperry
Obfsproxy | Client Add-on | C | Moderate | nickm, asn
*Thandy | Updater | Python | Light | chiiph, Erinn, nickm
*Ooni Probe | Scanner | Python | Moderate | hellais, ioerror
Shadow | Experimentation | C, Python | Moderate | robgjansen
TorCtl | Library | Python | Light | mikeperry
*Stem | Library | Python | Heavy | atagar
Metrics | Client Service | Java | Heavy | karsten
Atlas | Client Service | JavaScript | Moderate | hellais
TorStatus | Client Service | Python, Django | None |
Weather | Client Service | Python | None | kaner
GetTor | Client Service | Python | None | kaner
TorCheck | Client Service | Python, Perl | None |
Onionoo | Backend Service | Java | Moderate | karsten
BridgeDB | Backend Service | Python | None | kaner, nickm
TorFlow | Backend Service | Python | Moderate | mikeperry
*TorBEL | Backend Service | Python | None | Sebastian

* Project is still in an alpha state.

Tor (code, bug tracker)

Central project, providing the core software for using and participating in the Tor network. Numerous people contribute to the project to varying extents, but the chief architects are Nick Mathewson and Roger Dingledine.

Project Ideas:
Improving Tor's ability to resist censorship
Integrating Tor with user-space transport protocol libraries

JTor (code, bug tracker)

Java implementation of Tor and successor to OnionCoffee. This project isn't yet complete, and has been inactive since Fall 2010.

Tor Browser Bundle (code, bug tracker)

The Tor Browser Bundle is an easy-to-use portable package of Tor, Vidalia, and Firefox preconfigured to work together out of the box. This is actively being worked on by Erinn Clark.

Project Ideas:
Audit Tor Browser Bundles for data leaks
Usability testing of Tor

The Amnesic Incognito Live System (code, bug tracker)

The Amnesic Incognito Live System is a live CD/USB distribution preconfigured so that everything is safely routed through Tor and leaves no trace on the local system. This is a merger of the Amnesia and Incognito projects, and still under very active development.

Project Ideas:
Petname system for Tor hidden services
Tails server: Self-hosted services behind Tails-powered Tor hidden services

Torsocks (code, bug tracker)

Utility for adapting other applications to work with Tor. Development has slowed and compatibility issues remain with some platforms, but it's otherwise feature complete.

Torouter (bug tracker)

Project to provide an easy-to-use, embedded Tor instance for routers. This had high activity in late 2010, but has since been rather quiet.

Vidalia (code, bug tracker)

The most commonly used user interface for Tor. Matt Edman started the project in 2006 and brought it to its current stable state. Development slowed for several years, though Tomás Touceda has since taken a lead with pushing the project forward.

Project Ideas:
Tor Controller Status Event Interface for Vidalia
Torrc plugin and improved hidden service configuration panel

Arm (code, bug tracker)

Command-line monitor for Tor. This has been under very active development by its author, Damian Johnson, since early 2009 to make it a better general-purpose controller for *nix environments.

Orbot (code, bug tracker)

Provides Tor on the Android platform. This was under very active development up through Fall 2010, after which things have been quiet.

Torbutton (code, bug tracker)

Firefox addon that addresses many of the client-side threats to browsing the Internet anonymously. Mike Perry continues to adapt it to new threats, updated versions of Firefox, and possibly Chrome as well.

Project Ideas:
Torbutton equivalent for Thunderbird

Obfsproxy (code, bug tracker)

A proxy that shapes Tor traffic, making it harder for censors to detect and block Tor.

Project Ideas:
New and innovative pluggable transports
Defensive bridge active scanning measures
Fuzzer for the Tor protocol

Thandy (code)

Updater for Tor. The project began in the Summer of 2008 but wasn't completed. Recently interest in it has been rekindled and many aspects of its design (including the language it'll be in) are currently in flux.

Ooni Probe (code, bug tracker)

Censorship scanner, checking your local connection for blocked or modified content.

Shadow (code, bug tracker)

Shadow is a discrete-event network simulator that runs the real Tor software as a plug-in. Shadow is open-source software that enables accurate, efficient, controlled, and repeatable Tor experimentation.
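
To make the term "discrete-event simulator" concrete, here is a minimal sketch of the general technique. It is not Shadow's code or API: the event names, the handler signature, and the fixed 50 ms link latency are all assumptions made up for illustration. Events sit in a priority queue keyed by simulated time, and handlers may schedule follow-up events.

```python
import heapq


def run_simulation(initial_events, handlers, until):
    """Process (time_ms, name, payload) events in timestamp order up to `until`."""
    queue = []
    seq = 0  # tie-breaker so events with equal times keep insertion order
    for time_ms, name, payload in initial_events:
        heapq.heappush(queue, (time_ms, seq, name, payload))
        seq += 1

    log = []
    while queue:
        time_ms, _, name, payload = heapq.heappop(queue)
        if time_ms > until:
            break
        log.append((time_ms, name, payload))
        handler = handlers.get(name)
        if handler is None:
            continue
        # A handler returns follow-up events as (delay_ms, name, payload).
        for delay_ms, new_name, new_payload in handler(time_ms, payload):
            heapq.heappush(queue, (time_ms + delay_ms, seq, new_name, new_payload))
            seq += 1
    return log


def on_send(time_ms, payload):
    # Hypothetical link with a fixed 50 ms latency.
    return [(50, "recv", payload)]


log = run_simulation(
    [(0, "send", "cell-1"), (10, "send", "cell-2")],
    {"send": on_send},
    until=1000,
)
```

Because simulated time only advances when an event is popped, a run with the same inputs always produces the same log, which is what makes this style of experiment repeatable.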

TorCtl (code, bug tracker)

Python bindings and utilities for using the Tor control port. It has been stable for several years, with only minor revisions.
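
As a rough illustration of what a controller library has to do, the sketch below parses a reply in the line format used on the control port: a three-digit status code, a separator (`-` for intermediate lines, a space for the final line), and text. This is not TorCtl's implementation; the function names are hypothetical, the sample version string is made up, and multi-line `+` data replies are deliberately not handled.

```python
def parse_control_reply(raw):
    """Split a control-port reply into (status_code, [line_text, ...])."""
    lines = [line for line in raw.split("\r\n") if line]
    if not lines:
        raise ValueError("empty reply")

    status = None
    texts = []
    for line in lines:
        if len(line) < 4 or not line[:3].isdigit() or line[3] not in " -+":
            raise ValueError("malformed reply line: %r" % line)
        status = int(line[:3])
        texts.append(line[4:])

    # The last line of a complete reply uses a space as its separator.
    if lines[-1][3] != " ":
        raise ValueError("reply is not terminated")
    return status, texts


def getinfo_values(raw):
    """Return the key=value pairs from a successful GETINFO-style reply."""
    status, texts = parse_control_reply(raw)
    if status != 250:
        raise RuntimeError("request failed with status %d" % status)
    return dict(text.split("=", 1) for text in texts if "=" in text)


# Made-up sample reply; the version number is only a placeholder.
sample = "250-version=0.2.3.25\r\n250 OK\r\n"
values = getinfo_values(sample)
```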

Stem (code, bug tracker)

Python controller library with a similar scope to TorCtl, but with better testing, documentation, and API. This project is not yet feature complete.

Project Ideas:
Stem PathSupport Capabilities

Metrics (code: db, utils, web, bug tracker)

Processing and analytics of consensus data, provided to users via the metrics portal. This has been under active development for several years by Karsten Loesing.

Project Ideas:
Searchable Tor descriptor and Metrics data archive (Python/Django?)

Atlas (code)

Atlas is a web application to discover Tor relays and bridges. It provides useful information on how relays are configured along with graphics about their past usage. This is the third evolution of the TorStatus application.

TorStatus (code)

Portal providing an overview of the Tor network, and details on any of its current relays. Though very actively used, this project has been unmaintained for a long while. The original codebase was written in PHP, and students from Wesleyan wrote the new Django counterpart.

Weather (code, bug tracker)

Provides automatic notification to subscribed relay operators when their relay's unreachable. This underwent a rewrite by the Wesleyan HFOSS team, which went live in early 2011.

GetTor (code, bug tracker)

E-mail autoresponder providing Tor's packages over SMTP. This has been relatively unchanged for quite a while.

TorCheck (code, bug tracker)

Provides a simple site for determining if the visitor is using Tor or not. This has been relatively unchanged for quite a while.

Onionoo (code)

Onionoo is a JSON based protocol to learn information about currently running Tor relays and bridges.
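
A client of a JSON-based protocol like this mostly just decodes a document and filters it. The sketch below works on an inline sample rather than a live request; the field names (`relays`, `nickname`, `fingerprint`, `running`) and the sample values are assumptions for illustration and should be checked against the Onionoo protocol description before use.

```python
import json

# Hypothetical sample document; nicknames and fingerprints are placeholders.
SAMPLE = """
{
  "relays": [
    {"nickname": "ExampleRelayA",
     "fingerprint": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
     "running": true},
    {"nickname": "ExampleRelayB",
     "fingerprint": "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB",
     "running": false}
  ],
  "bridges": []
}
"""


def running_relays(document_text):
    """Return the nicknames of relays marked as running in the document."""
    document = json.loads(document_text)
    return [
        relay["nickname"]
        for relay in document.get("relays", [])
        if relay.get("running")
    ]


names = running_relays(SAMPLE)
```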

BridgeDB (code, bug tracker)

Backend bridge distributor, handling the various pools they're distributed in. This was actively developed until Fall of 2010.

TorFlow (code, bug tracker)

Library and collection of services for actively monitoring the Tor network. These include the Bandwidth Scanners (measuring throughput of relays) and SoaT (scans for malicious or misconfigured exit nodes). SoaT was last actively developed in the Summer of 2010, and the Bandwidth Scanners a few months later. Both have been under active use since then, but development has stopped.

TorBEL (code, bug tracker)

The Tor Bulk Exitlist provides a method of identifying if IPs belong to exit nodes or not. This is a replacement for TorDNSEL which is a stable (though unmaintained) Haskell application for this purpose. The initial version of TorBEL was started in GSOC 2010 but since then the project has been inactive.
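
From a consumer's point of view, a bulk exit list boils down to a set-membership test. The sketch below assumes a hypothetical list format of one IP address per line with `#` comments; that format is an assumption, not TorBEL's actual output, and the addresses come from the reserved documentation ranges.

```python
import ipaddress


def load_exit_list(text):
    """Parse one address per line, skipping blank lines and '#' comments."""
    exits = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        exits.add(ipaddress.ip_address(line))
    return exits


def is_exit(address, exits):
    """True if `address` appears in the loaded exit list."""
    return ipaddress.ip_address(address) in exits


# Hypothetical list contents using documentation-only addresses.
EXIT_LIST = """
# exit addresses
192.0.2.1
198.51.100.7
"""

exits = load_exit_list(EXIT_LIST)
```

Parsing through `ipaddress` rather than comparing strings means differently written forms of the same address still match.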

Project Ideas

You may find some of these projects to be good Google Summer of Code ideas. We have labelled each idea with how useful it would be to the overall Tor project (priority), how much work we expect it would be (effort level), how much clue you should start with (skill level), and which of our core developers would be good mentors. If one or more of these ideas looks promising to you, please contact us to discuss your plans rather than sending blind applications. You may also want to propose your own project idea — which often results in the best applications.

  1. Audit Tor Browser Bundles for data leaks
    Priority: High
    Effort Level: High
    Skill Level: Medium
    Likely Mentors: Jacob

    The Tor Browser Bundle incorporates Tor, Firefox, Polipo, and the Vidalia user interface (and optionally the Pidgin Instant Messaging client). Components are pre-configured to operate in a secure way, and it has very few dependencies on the installed operating system. It has therefore become one of the easiest and most popular ways to use Tor on Windows.

    This project is to identify all of the traces left behind by using a Tor Browser Bundle on Windows, Mac OS X, or Linux. Developing ways to stop, counter, or remove these traces is a final step.

    Students should be familiar with operating system analysis, application development on one or preferably all of Windows, Linux, and Mac OS X, and be comfortable with C/C++ and shell scripting.

    Since the core of this project is researching and auditing TBB this is not likely to be a good GSoC project.

  2. Develop a fully automatic firewall-probing system
    Priority: High
    Effort Level: Medium to High
    Skill Level: High
    Likely Mentors: Robert Ransom, Jacob

    We would like to have a fully automatic firewall-probing system for blocking systems with no long-term state (i.e. firewalls that can examine each connection, but do not change their behaviour for future connections based on the traffic they have seen).

    Ideally, volunteers would only need to set up one or more test servers, and run the probe client program on a publicly accessible computer behind the firewall.

    The test tool should:

    • generate packet captures on both ends (and send them out to the extent possible),
    • cycle through all the SSL configurations we might want to test through a censorship device, and
    • also test some other protocols to see whether they are allowed through the firewall (IMAP and other mail protocols, BitTorrent, DTLS, etc.).

  3. New and innovative pluggable transports
    Priority: High
    Effort Level: High
    Skill Level: High
    Likely Mentors: asn, Steven

    Not-very-smart transports like ROT13 and base64 are nice but not super interesting. Other ideas like bittorrent transports might be relevant, but you will have to provide security proofs on why they are harder to detect and block than other less-sophisticated transports.

    The whole point of this project, though, is to come up with new transports that we haven't already thought of. Be creative.

    Bonus points if your idea is interesting and still implementable through the summer period.

    More bonus points if it's implemented on top of obfsproxy, or if your implementation has a pluggable transport interface on top of it (as specified here).
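
    For reference, the kind of "not-very-smart" transport mentioned above can be sketched in a few lines. This toy example is not obfsproxy code and does not follow the pluggable transport specification; the function names are hypothetical. It base64-encodes the payload and then applies ROT13, which changes the bytes on the wire but adds no real resistance to detection.

```python
import base64
import codecs


def obfuscate(payload: bytes) -> bytes:
    """Base64-encode the payload, then ROT13 the resulting ASCII text."""
    encoded = base64.b64encode(payload).decode("ascii")
    return codecs.encode(encoded, "rot_13").encode("ascii")


def deobfuscate(wire: bytes) -> bytes:
    """Reverse obfuscate(): undo ROT13, then base64-decode."""
    decoded = codecs.decode(wire.decode("ascii"), "rot_13")
    return base64.b64decode(decoded)


message = b"hello over a toy transport"
wire = obfuscate(message)
restored = deobfuscate(wire)
```

    The encoding is keyless and fully reversible by anyone, which is why a real proposal needs an argument for why its traffic is hard to classify.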

  4. Defensive bridge active scanning measures
    Priority: High
    Effort Level: High
    Skill Level: High
    Likely Mentors: asn

    Involves providing good answers to this thread as well as concrete implementation plans for it.

    This also involves implementing proposals 189 and 190.

  5. Stem PathSupport Capabilities
    Priority: High
    Effort Level: High
    Skill Level: Medium
    Likely Mentors: Damian (atagar)

    Stem is a Python controller library for tor. Like its predecessor, TorCtl, it uses tor's control protocol to help developers program against the tor process, enabling them to build things similar to Vidalia and arm.

    While TorCtl provided a fine first draft for this sort of functionality, it has not proved to be extensible nor maintainable. Stem is a rewrite of TorCtl with a heavy focus on testing, documentation, and providing a developer friendly API.

    At the moment stem is still very much incomplete, missing several pieces of functionality that TorCtl provides. This is a project to fix that by porting TorCtl's PathSupport module to stem, writing tests for it, and migrating a couple of clients to use it.

    PathSupport provides applications with programmatic control over how tor's circuits are built, for instance letting you exit from particular relays. This is used by projects like TorBEL, the Bandwidth Scanners, and SoaT.

    This project can be broken into three parts...

    1. Look at PathSupport's clients to figure out how it is used and come up with the API that we will use for stem. Note that the goal of this project is not to simply copy PathSupport, but to make it better. This task would ideally be done as part of writing the GSoC application.

    2. Implement the PathSupport counterpart for stem. This should be done in an incremental fashion, writing the feature, tests, and going through a code review before moving on. I'll be pretty anal about making it as good as we can during these code reviews so plan for this to take a while. ;)

    3. The real test of the API that you've developed will come when we use it in some real applications. Try to migrate a TorCtl client or two to stem, filling in functionality that we're missing and improving our API as we discover issues. A particularly good client to start with would be TorBEL.

    Upon reflection this is not an especially good project for this year's GSoC. You are still perfectly welcome to apply for this project, but other stem related tasks such as implementing a general controller, descriptor fetching, and client migrations would be better. For the discussion that led to this see this thread.

  6. Integrating Tor with user-space transport protocol libraries
    Priority: Medium to High
    Effort Level: High
    Skill Level: High
    Likely Mentors: Steven

    Tor currently sends data over TCP links between nodes. Prior research has indicated that this may not be optimal, and instead the role that TCP plays (congestion control and reliability) should be moved into Tor itself. This would allow a number of desirable changes, such as preventing errors on one circuit delaying another, and giving Tor control and visibility of congestion control.

    There are many ways to do this, each with their own tradeoffs and difficulty of implementation. This project will be to select one (or more) option and implement it in Tor. The primary goal will be to test this modified version of Tor in simulation, but if it turns out to work well, it could be deployed in the live Tor network.

    Excellent C programming skills are needed, and knowledge of Tor internals is highly desirable.

  7. Improving Tor's ability to resist censorship
    Priority: Medium to High
    Effort Level: Medium to High
    Skill Level: High
    Likely Mentors: Jake, Thomas

    The Tor 0.2.1.x series makes significant improvements in resisting national and organizational censorship. But Tor still needs better mechanisms for some parts of its anti-censorship design.

    One huge category of work is adding features to our BridgeDB service (Python). Tor aims to give out bridge relay addresses to users that can't reach the Tor network directly, but there's an arms race between algorithms for distributing addresses and algorithms for gathering and blocking them. See our blog post on the topic as an overview, and then look at Roger's or-dev post from December 2009 for more recent thoughts — lots of design work remains.

    If you want to get more into the guts of Tor itself (C), a more minor problem we should address is that current Tors can only listen on a single address/port combination at a time. There's a proposal to address this limitation and allow clients to connect to any given Tor on multiple addresses and ports, but it needs more work.

    This project could involve a lot of research and design. One of the big challenges will be identifying and crafting approaches that can still resist an adversary even after the adversary knows the design, and then trading off censorship resistance with usability and robustness.

  8. Petname system for Tor hidden services
    Priority: Medium
    Effort Level: High
    Skill Level: High
    Likely Mentors: ague

    Tor provides hidden services. These services are only reachable through Tor itself, and provide greater anonymity both for the providers of the service and for its users.

    One current downside of Tor hidden services is that they are addressed using 80-bit base32-encoded addresses such as "v2cbb2l4lsnpio4q.onion". These addresses are hard to remember, which makes them hard to use within an amnesic environment like Tails.

    The project is to implement a petname system for Tor hidden services: a way for users or providers of Tor hidden services to add a simple 'nickname' to a central database. Users could then query this central database to retrieve a full hidden service address by giving a nickname.

    Adding petnames to the database could be done using a web interface or automated fetch like those described in the ".onion nym system" proposal.

    Querying the database could be done using a web interface, a REST API and a DNS interface.

    In order not to grow indefinitely, the software should regularly test whether hidden services are still reachable and, depending on the last time a nickname was accessed, clean up the database as necessary.

    The software should allow a distributed, fault-tolerant setup. All nodes should have a synchronized copy of the database, should be ready to answer queries and should coordinate the tests for hidden service availability.

    The resulting codebase must be easy to deploy: it should not be hard to set up new databases.

    It is expected that the volunteer will be using Behaviour Driven Development methods, either in Ruby using Cucumber and RSpec, or in Python using similar tools.
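
    The core of the idea above (register a nickname, look it up, and prune stale entries) can be sketched in memory as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the reachability check is passed in as a callback, and the pruning rule (drop entries that are both idle and unreachable) is one possible reading of the requirement. The distributed setup and the web, REST, and DNS interfaces are out of scope here.

```python
import base64
import time


def is_onion_address(address):
    """True for a 16-character base32 label (80 bits) followed by .onion."""
    if not address.endswith(".onion"):
        return False
    label = address[: -len(".onion")]
    if len(label) != 16:
        return False
    try:
        return len(base64.b32decode(label.upper())) == 10
    except ValueError:
        return False


class PetnameRegistry:
    def __init__(self):
        self._entries = {}  # nickname -> [address, last_access_time]

    def register(self, nickname, address, now=None):
        if not is_onion_address(address):
            raise ValueError("not a hidden service address: %r" % address)
        if nickname in self._entries:
            raise KeyError("nickname already taken: %r" % nickname)
        self._entries[nickname] = [address, time.time() if now is None else now]

    def lookup(self, nickname, now=None):
        entry = self._entries[nickname]
        entry[1] = time.time() if now is None else now  # record the access
        return entry[0]

    def prune(self, max_idle, is_reachable, now=None):
        """Drop nicknames that are idle for too long and no longer reachable."""
        now = time.time() if now is None else now
        stale = [
            name
            for name, (address, last_access) in self._entries.items()
            if now - last_access > max_idle and not is_reachable(address)
        ]
        for name in stale:
            del self._entries[name]
        return stale


registry = PetnameRegistry()
registry.register("example", "v2cbb2l4lsnpio4q.onion", now=0)
```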

  9. Tails server: Self-hosted services behind Tails-powered Tor hidden services
    Priority: Medium
    Effort Level: High
    Skill Level: Medium, but wide-scoped
    Likely Mentors: intrigeri, anonym

    Let's talk about group collaboration, communication and data sharing infrastructure, such as chat servers, wikis, or file repositories.

    Hosting such data and infrastructure in the cloud generally means trusting the service providers not to disclose content, usage, or user location information to third parties. Hence, there are many threat models in which cloud hosting is not suitable.

    Tor partly addresses the user location part; this is great, but content is left unprotected.

    There are two main ways to protect such content: either to encrypt it client-side (security by design), or to avoid putting it into untrusted hands in the first place.

    Cloud solutions that offer security by design are rare and generally not mature yet. The Tails server project is about exploring the other option: avoiding putting private data into untrusted hands in the first place.

    This is made possible by Tor hidden services, which allow users to offer location-hidden services and make self-hosting possible in many threat models. Self-hosting has its own set of problems, however, particularly in contexts where the physical security of the hosting place is not assured. Combining Tor hidden services with Tails' amnesia property and limited support for persistent encrypted data makes it possible to protect content, to a great degree, even in such contexts.

    In short, setting up a new Tails server would be done by:

    1. Alice plugs a USB stick into a running desktop Tails system.
    2. Alice uses a GUI to easily configure the needed services.
    3. Alice unplugs the USB stick, which now contains the encrypted service configuration and data storage space.
    4. Alice plugs that USB stick (and possibly a Tails Live CD) into the old laptop that was dedicated to run Tails server.
    5. Once booted, Alice enters the encryption passphrase either directly using the keyboard or through a web interface listening on the local network.
    6. Then, Bob can use the configured services once he gets hold of the hidden service address. (The petname system for Tor hidden services project would be very complementary to this one, by the way.)

    Tails server should make do with hardware that is a bit old (such as a PIII-450 laptop with 256MB of RAM) and/or half broken (e.g. non-functional hard-disk, screen or keyboard).

    The challenges behind this project are:

    • Design and write the services configuration GUI [keywords: edit configuration files, upgrade between major Debian versions, debconf].
    • How to create the hidden service key? [keywords: Vidalia, control protocol].
    • Adapt the Tails boot process to allow switching to "server mode" when appropriate.
    • Add support, to the Tails persistence setup process, for asking an encryption passphrase without X, and possibly with a broken keyboard and/or screen [keywords: local network, SSL/TLS?, certificate?].

    This project can easily grow quite large, so the first task would probably be to clarify what it would need to get an initial (minimal but working) implementation ready to be shipped to users.

    This project does not require being an expert in one specific field, but it does require being experienced and at ease with a wide range of software development tools, processes, and operating system knowledge.

    Undertaking this project requires in-depth knowledge of Debian-like systems (self-test: do the terms "dpkg conffile" and "debconf preseeding" sound new to you?). The Debian Live persistence system is written in shell, so being at ease with robust shell scripting is a must. Finally, at least two pieces of software need to be written from scratch (a GUI and a webapp); the preferred languages for these tasks would be Python and Perl. Using Behaviour Driven Development methods to convey expectations and acceptance criteria would be most welcome.

    For more information see https://tails.boum.org/todo/server_edition/

  10. Improve our GeoIP file format
    Priority: Medium
    Effort Level: Medium
    Skill Level: Medium to High
    Likely Mentors: Robert Ransom

    Currently, Tor bridges and relays read an entire IP->country database into memory from a text file during startup. We would like to distribute this database and store it on disk in a much more compact form, and perform IP->country lookups on it in its on-disk format if possible.

    We have a sketch of a design for a moderately optimized format for IPv4 GeoIP data; this project will involve both implementing the IPv4 format and designing and implementing a format for IPv6 GeoIP data.

    Since the core of this project is researching IPv6 GeoIP data and designing the IPv6 format, this is not likely to be a good GSoC project.
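
    To illustrate what "lookups on the on-disk format" could mean, here is a minimal sketch of one possible compact IPv4 layout: sorted, fixed-size binary records searched with binary search directly on the packed bytes. This is not the design sketch referred to above. The record layout, the function names, and the country codes "AA" and "BB" are assumptions for illustration only.

```python
import ipaddress
import struct

# One record: range start, range end (big-endian uint32), 2-byte country code.
RECORD = struct.Struct(">II2s")


def pack_ranges(ranges):
    """Pack (start_ip, end_ip, country) tuples into sorted 10-byte records."""
    rows = sorted(
        (int(ipaddress.IPv4Address(start)), int(ipaddress.IPv4Address(end)), country)
        for start, end, country in ranges
    )
    out = bytearray()
    for start, end, country in rows:
        out += RECORD.pack(start, end, country.encode("ascii"))
    return bytes(out)


def lookup(table, address):
    """Binary-search the packed table; return the country code or None."""
    target = int(ipaddress.IPv4Address(address))
    lo, hi = 0, len(table) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        start, end, country = RECORD.unpack_from(table, mid * RECORD.size)
        if target < start:
            hi = mid
        elif target > end:
            lo = mid + 1
        else:
            return country.decode("ascii")
    return None


# Hypothetical data using documentation-only address ranges.
table = pack_ranges([
    ("198.51.100.0", "198.51.100.255", "BB"),
    ("192.0.2.0", "192.0.2.255", "AA"),
])
```

    Since `unpack_from` accepts any buffer, the same `lookup` would work on a memory-mapped file, so the whole database would not need to be parsed into memory at startup.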

  11. Torrc plugin and improved hidden service configuration panel
    Priority: Medium
    Effort Level: Medium
    Skill Level: Medium
    Likely Mentors: Tomás

    Vidalia's configuration handling has changed in the alpha branch. Now every Tor option is saved in the torrc file. With that change, the Hidden Service configuration panel was removed due to its specificity and its multiple bugs.

    The idea would be to provide the new Torrc class' functionality to the Plugin Engine and with that, create a better Hidden Service configuration panel as a plugin.

    A person undertaking this project should have good UI design and layout skills and some C++ development experience. Previous experience with Qt and Qt's Designer will be very helpful, but is not required. Javascript knowledge is a plus, but its absence shouldn't be a problem if the person meets the previous requirements.

  12. Searchable Tor descriptor and Metrics data archive
    Priority: Medium
    Effort Level: Medium
    Skill Level: Medium
    Likely Mentors: Karsten

    The Metrics data archive of Tor relay descriptors and other Tor-related network data has grown to over 100G in size, bz2-compressed. We have developed two search interfaces: the relay search finds relays by nickname, fingerprint, or IP address in a given month; ExoneraTor finds whether a given IP address was a relay on a given day.

    We'd like to have a more general search application for Tor descriptors and metrics data. There are more descriptor types that we'd like to include in the search. The search application should handle most of them and understand some semantics like what's a timestamp, what's an IP address, and what's a link to another descriptor. Users should then be able to search for arbitrary strings or limit their search to given time periods or IP address ranges. Descriptors that reference other descriptors should contain links, and descriptors should be able to say from where they are linked. The goal is to make the archive easily browsable.

    The search application shall be separate from the metrics website and shouldn't rely on the metrics website codebase. The search application will receive hourly updated descriptor data from the metrics website via rsync. Programming language and database system are not specified yet, though there's a slight preference for Python/Django and Postgres for maintenance reasons. If there are good reasons to pick something else, e.g., some NoSQL variant or some search application framework, that's fine, too. Further requirements are that lookups should be really fast and that changes to the search application can be implemented in reasonable time.

    Applications for this project should come with a design of the proposed search application, ideally with a proof-of-concept based on a subset of the available data to show that it will be able to handle the 100G+ of data.

  13. Torbutton equivalent for Thunderbird
    Priority: Medium
    Effort Level: High
    Skill Level: High
    Likely Mentors: Jake

    We're hearing from an increasing number of users that they want to use Thunderbird with Tor. However, there are plenty of application-level concerns; for example, by default Thunderbird will put your hostname in the outgoing mail that it sends. At some point we should start a new push to build a Thunderbird extension similar to Torbutton.

  14. Usability testing of Tor
    Priority: Medium
    Effort Level: Medium
    Skill Level: Low to Medium
    Likely Mentors: Andrew

    We need usability testing of Tor, especially the browser bundle, ideally amongst our target demographic. That would help a lot in knowing what needs to be done in terms of bug fixes or new features. We get this informally at the moment, but a more structured process would be better.

    Please note that since this isn't a coding project, it isn't suitable for Google Summer of Code.

  15. Tor Controller Status Event Interface for Vidalia
    Priority: Medium
    Effort Level: Medium
    Skill Level: Low to Medium
    Likely Mentors: Tomás

    There are a number of status changes inside Tor of which the user may need to be informed. For example, if the user is trying to set up his Tor as a relay and Tor decides that its ports are not reachable from outside the user's network, we should alert the user. Currently, all the user gets is a couple of log messages in Vidalia's 'message log' window, which they likely never see since they don't receive a notification that something has gone wrong. Even if the user does actually look at the message log, most of the messages make little sense to the novice user.

    Tor has the ability to inform Vidalia of many such status changes, and we recently implemented support for a couple of these events. Still, there are many more status events which the user should be informed of, and we need a better UI for actually displaying them to the user.

    The goal of this project then is to design and implement a UI for displaying Tor status events to the user. For example, we might put a little badge on Vidalia's tray icon that alerts the user to new status events they should look at. Double-clicking the icon could bring up a dialog that summarizes recent status events in simple terms and maybe suggests a remedy for any negative events if they can be corrected by the user. Of course, this is just an example and one is free to suggest another approach.

    A person undertaking this project should have good UI design and layout skills and some C++ development experience. Previous experience with Qt and Qt's Designer will be very helpful, but is not required. Some English writing ability will also be useful, since this project will likely involve writing small amounts of help documentation that should be understandable by non-technical users. Bonus points for some graphic design/Photoshop fu, since we might want/need some shiny new icons too.

  16. Fuzzer for the Tor protocol
    Priority: Low to Medium
    Effort Level: Medium to High
    Skill Level: High
    Likely Mentors: asn

    Involves researching good and smart ways to fuzz stateful network protocols, and also implementing the fuzzer.

    We are mostly looking for a fuzzer that fuzzes the Tor protocol itself, and not the Tor directory protocol.

    Bonus points if it's extremely modular. Relevant research:

    • PROTOS - Security Testing of Protocol Implementations
    • INTERSTATE: A Stateful Protocol Fuzzer for SIP
    • Detecting Communication Protocol Security Flaws by Formal Fuzz Testing and Machine Learning
    • SNOOZE: Toward a Stateful NetwOrk prOtocol fuzZEr
    • Michal Zalewski's "bugger"
    • Also look at the concepts of "model checking" and "symbolic execution" to get inspired.
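
    As a starting point for thinking about stateful fuzzing, the sketch below replays a valid prefix of a message sequence to bring the target into some protocol state, then sends a single mutated message. It is a generic illustration, not a Tor fuzzer: the message contents, the `send` callback, and the toy target that fails on high bytes are all hypothetical stand-ins.

```python
import random


def mutate(message, rng, flips=1):
    """Return a copy of `message` with `flips` randomly chosen bits inverted."""
    data = bytearray(message)
    for _ in range(flips):
        if not data:
            break
        index = rng.randrange(len(data))
        data[index] ^= 1 << rng.randrange(8)
    return bytes(data)


def fuzz_session(valid_sequence, send, rng, rounds):
    """Each round: replay a valid prefix, then send one mutated message."""
    failures = []
    for _ in range(rounds):
        cut = rng.randrange(len(valid_sequence))
        prefix = valid_sequence[:cut]
        fuzzed = mutate(valid_sequence[cut], rng)
        try:
            send(prefix + [fuzzed])
        except Exception as exc:
            # Record enough detail to replay the failing case later.
            failures.append((cut, fuzzed, repr(exc)))
    return failures


def toy_target(messages):
    # Hypothetical endpoint that "crashes" on any byte with the high bit set.
    for message in messages:
        if any(byte >= 0x80 for byte in message):
            raise RuntimeError("unexpected high byte")


failures = fuzz_session(
    [b"HELLO", b"AUTH", b"DATA"], toy_target, random.Random(0), rounds=50
)
```

    Passing in a seeded `random.Random` keeps runs reproducible, which matters when a failure has to be replayed against a long-lived connection.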

  17. APAF: Anonymous Python Application Framework
    Priority: Medium
    Effort Level: Medium
    Skill Level: Medium
    Likely Mentors: Arturo (hellais)

    The goal of APAF is to create a build framework for creating a binary package for multiple platforms (.app/dmg, .exe, .deb, etc.) that includes the python interpreter (cpython), the Tor binary and all the UI necessary to make users be able to easily run the bundled Tor Hidden Service.

    For GSoC the student is expected to create the build system capable of building a simple web application that serves static files. It should also include a web UI for a wizard setup, checking the status of the HS and configuring it.

    For more details, see https://pad.riseup.net/p/1zA8FI4nrYlq

  18. Bring up new ideas!
    Don't like any of these? Look at the Tor development roadmap for more ideas, or just try out Tor, Vidalia, and Torbutton, and find out what you think needs fixing. Some of the current proposals might also be short on developers.
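
    To make the fuzzer idea in project 16 a little more concrete, here is a minimal sketch of a stateful mutation fuzzer. Everything in it is an assumption made for illustration: the state names, the transition table, the command numbers, and the 2-byte circuit ID plus 1-byte command header are simplified stand-ins, not taken from the Tor specification. It only generates test cases and does not talk to a real relay.

```python
import random
import struct

CELL_LEN = 512                      # fixed cell size (assumed for this sketch)
HEADER = struct.Struct(">HB")       # assumed layout: 2-byte circuit ID, 1-byte command
PAYLOAD_LEN = CELL_LEN - HEADER.size

# Hypothetical session model: which message kinds are valid in each state.
TRANSITIONS = {
    "start": {"create": "circuit_open"},
    "circuit_open": {"relay": "circuit_open", "destroy": "start"},
}
COMMANDS = {"create": 1, "relay": 3, "destroy": 4}  # illustrative numbers only


def build_cell(circ_id, command, payload=b""):
    """Pack a fixed-length cell, zero-padding the payload."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload too long")
    return HEADER.pack(circ_id, command) + payload.ljust(PAYLOAD_LEN, b"\x00")


def mutate(cell, rng, flips=4):
    """Flip a few random bits while keeping the cell length unchanged."""
    data = bytearray(cell)
    for _ in range(flips):
        pos = rng.randrange(len(data))
        data[pos] ^= 1 << rng.randrange(8)
    return bytes(data)


def fuzz_session(rng, steps=5):
    """Walk the state model, yielding (state, name, valid_cell, mutated_cell)."""
    state = "start"
    for _ in range(steps):
        name = rng.choice(sorted(TRANSITIONS[state]))
        valid = build_cell(circ_id=1, command=COMMANDS[name], payload=b"hello")
        yield state, name, valid, mutate(valid, rng)
        state = TRANSITIONS[state][name]
```

    A harness would send the valid cells to reach a given state and then substitute a mutated cell at the step under test. Keeping the state model, the cell builder, and the mutator as separate pieces is one way to approach the modularity mentioned above.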

Other Coding and Design related ideas

  1. Tor relays don't work well on Windows XP. On Windows, Tor uses the standard select() system call, which uses space in the non-paged pool. This means that a medium-sized Tor relay will exhaust the non-paged pool, causing havoc and system crashes. We should probably be using overlapped IO instead. One solution would be to teach libevent how to use overlapped IO rather than select() on Windows, and then adapt Tor to the new libevent interface. Christian King made a good start on this in the summer of 2007.
  2. We need to actually start building our blocking-resistance design. This involves fleshing out the design, modifying many different pieces of Tor, adapting Vidalia so it supports the new features, and planning for deployment.
  3. We need a flexible simulator framework for studying end-to-end traffic confirmation attacks. Many researchers have whipped up ad hoc simulators to support their intuition either that the attacks work really well or that some defense works great. Can we build a simulator that's clearly documented and open enough that everybody knows it's giving a reasonable answer? This will spur a lot of new research. See the entry below on confirmation attacks for details on the research side of this task — who knows, when it's done maybe you can help write a paper or three also.
  4. Tor 0.1.1.x and later include support for hardware crypto accelerators via OpenSSL. It has been lightly tested and is possibly very buggy. We're looking for more rigorous testing, performance analysis, and optimally, code fixes to OpenSSL and Tor if needed.
  5. Perform a security analysis of Tor with "fuzz". Determine if there are good fuzzing libraries out there for what we want. Win fame by getting credit when we put out a new release because of you!
  6. Tor uses TCP for transport and TLS for link encryption. This is nice and simple, but it means all cells on a link are delayed when a single packet gets dropped, and it means we can only reasonably support TCP streams. We have a list of reasons why we haven't shifted to UDP transport, but it would be great to see that list get shorter. We also have a proposed specification for Tor and UDP — please let us know what's wrong with it.
  7. We're not that far from having IPv6 support for destination addresses (at exit nodes). If you care strongly about IPv6, that's probably the first place to start.
  8. We need a way to generate the website diagrams (for example, the "How Tor Works" pictures on the overview page) from source, so we can translate them as UTF-8 text rather than edit them by hand with Gimp. We might want to integrate this as a wml file so translations are easy and images are generated in multiple languages whenever we build the website.
  9. How can we make the various LiveCD/USB systems easier to maintain, improve, and document? One example is The Amnesic Incognito Live System.
  10. Another anti-censorship project is to try to make Tor more scanning-resistant. Right now, an adversary can identify Tor bridges just by trying to connect to them, following the Tor protocol, and seeing if they respond. To solve this, bridges could act like webservers (HTTP or HTTPS) when contacted by port-scanning tools, and not act like bridges until the user provides a bridge-specific key. To start, check out Shane Pope's thesis and prototype.
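
    As a starting point for the simulator framework in item 3, the following toy sketch shows the basic shape of an end-to-end traffic confirmation experiment: generate per-window packet counts for several streams, produce a delayed and lossy view of one of them, and ask which candidate correlates best. The traffic model, the delay, the loss rate, and all function names are assumptions chosen for illustration. A real framework would need realistic traffic and network models.

```python
import random


def simulate_flow(rng, windows=200, rate=5.0):
    """Packet counts per time window for one stream (toy bursty model)."""
    return [max(0, round(rng.gauss(rate, rate))) for _ in range(windows)]


def observe_at_exit(flow, rng, delay=2, drop_prob=0.05):
    """The same stream seen at the far end: shifted by `delay` windows, with loss."""
    shifted = [0] * delay + flow[:len(flow) - delay]
    return [sum(1 for _ in range(n) if rng.random() > drop_prob) for n in shifted]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0


def best_match(entry_flow, exit_flows, max_delay=5):
    """Index of the exit flow that correlates best over a range of delays."""
    def score(exit_flow):
        return max(
            pearson(entry_flow[:len(entry_flow) - d], exit_flow[d:])
            for d in range(max_delay + 1)
        )
    return max(range(len(exit_flows)), key=lambda i: score(exit_flows[i]))
```

    Varying the number of windows, the loss rate, or the number of candidate streams in a setup like this is one way to explore how much traffic an adversary needs before the match becomes reliable.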

Research

  1. The "end-to-end traffic confirmation attack": by watching traffic at Alice and at Bob, we can compare traffic signatures and become convinced that we're watching the same stream. So far Tor accepts this as a fact of life and assumes this attack is trivial in all cases. First of all, is that actually true? How much traffic of what sort of distribution is needed before the adversary is confident he has won? Are there scenarios (e.g. not transmitting much) that slow down the attack? Do some traffic padding or traffic shaping schemes work better than others?
  2. A related question is: Does running a relay/bridge provide additional protection against these timing attacks? Can an external adversary that can't see inside TLS links still recognize individual streams reliably? Does the amount of traffic carried degrade this ability any? What if the client-relay deliberately delayed upstream relayed traffic to create a queue that could be used to mimic timings of client downstream traffic to make it look like it was also relayed? This same queue could also be used for masking timings in client upstream traffic with the techniques from adaptive padding, but without the need for additional traffic. Would such an interleaving of client upstream traffic obscure timings for external adversaries? Would the strategies need to be adjusted for asymmetric links? For example, on asymmetric links, is it actually possible to differentiate client traffic from natural bursts due to their asymmetric capacity? Or is it easier than symmetric links for some other reason?
  3. Repeat Murdoch and Danezis's attack from Oakland 05 on the current Tor network. See if you can learn why it works well on some nodes and not well on others. (My theory is that the fast nodes with spare capacity resist the attack better.) If that's true, then experiment with the RelayBandwidthRate and RelayBandwidthBurst options to run a relay that is used as a client while relaying the attacker's traffic: as we crank down the RelayBandwidthRate, does the attack get harder? What's the right ratio of RelayBandwidthRate to actual capacity? Or is it a ratio at all? While we're at it, does a much larger set of candidate relays increase the false positive rate or other complexity for the attack? (The Tor network is now almost two orders of magnitude larger than it was when they wrote their paper.) Be sure to read Don't Clog the Queue too.
  4. The "routing zones attack": most of the literature thinks of the network path between Alice and her entry node (and between the exit node and Bob) as a single link on some graph. In practice, though, the path traverses many autonomous systems (ASes), and it's not uncommon that the same AS appears on both the entry path and the exit path. Unfortunately, to accurately predict whether a given Alice, entry, exit, Bob quad will be dangerous, we need to download an entire Internet routing zone and perform expensive operations on it. Are there practical approximations, such as avoiding IP addresses in the same /8 network?
  5. Other research questions regarding geographic diversity consider the tradeoff between choosing an efficient circuit and choosing a random circuit. Look at Stephen Rollyson's position paper on how to discard particularly slow choices without hurting anonymity "too much". This line of reasoning needs more work and more thinking, but it looks very promising.
  6. Tor doesn't work very well when relays have asymmetric bandwidth (e.g. cable or DSL). Because Tor has separate TCP connections between each hop, if the incoming bytes are arriving just fine and the outgoing bytes are all getting dropped on the floor, the TCP push-back mechanisms don't really transmit this information back to the incoming streams. Perhaps Tor should detect when it's dropping a lot of outgoing packets, and rate-limit incoming streams to regulate this itself? I can imagine a build-up and drop-off scheme where we pick a conservative rate-limit, slowly increase it until we get lost packets, back off, repeat. We need somebody who's good with networks to simulate this and help design solutions; and/or we need to understand the extent of the performance degradation, and use this as motivation to reconsider UDP transport.
  7. A related topic is congestion control. Is our current design sufficient once we have heavy use? Maybe we should experiment with variable-sized windows rather than fixed-size windows? That seemed to go well in an ssh throughput experiment. We'll need to measure and tweak, and maybe overhaul if the results are good.
  8. Our censorship-resistance goals include preventing an attacker who's looking at Tor traffic on the wire from distinguishing it from normal SSL traffic. Obviously we can't achieve perfect steganography and still remain usable, but for a first step we'd like to block any attacks that can win by observing only a few packets. One of the remaining attacks we haven't examined much is that Tor cells are 512 bytes, so the traffic on the wire may well be a multiple of 512 bytes. How much does the batching and overhead in TLS records blur this on the wire? Do different buffer flushing strategies in Tor affect this? Could a bit of padding help a lot, or is this an attack we must accept?
  9. Tor circuits are built one hop at a time, so in theory we have the ability to make some streams exit from the second hop, some from the third, and so on. This seems nice because it breaks up the set of exiting streams that a given relay can see. But if we want each stream to be safe, the "shortest" path should be at least 3 hops long by our current logic, so the rest will be even longer. We need to examine this performance / security tradeoff.
  10. It's not that hard to DoS Tor relays or directory authorities. Are client puzzles the right answer? What other practical approaches are there? Bonus if they're backward-compatible with the current Tor protocol.
  11. Programs like Torbutton aim to hide your browser's UserAgent string by replacing it with a uniform answer for every Tor user. That way the attacker can't splinter Tor's anonymity set by looking at that header. It tries to pick a string that is commonly used by non-Tor users too, so it doesn't stand out. Question one: how badly do we hurt ourselves by periodically updating the version of Firefox that Torbutton claims to be? If we update it too often, we splinter the anonymity sets ourselves. If we don't update it often enough, then all the Tor users stand out because they claim to be running a quite old version of Firefox. The answer here probably depends on the Firefox versions seen in the wild. Question two: periodically people ask us to cycle through N UserAgent strings rather than stick with one. Does this approach help, hurt, or not matter? Consider: cookies and recognizing Torbutton users by their rotating UserAgents; malicious websites that only attack certain browsers; and whether the answers to question one impact this answer.
  12. How many bridge relays do you need to know to maintain reachability? We should measure the churn in our bridges. If there is lots of churn, are there ways to keep bridge users more likely to stay connected?
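
    The build-up and drop-off scheme imagined in item 6 resembles additive-increase/multiplicative-decrease (AIMD), the pattern used by TCP congestion control. The sketch below is a toy model of that idea, not a proposal for Tor's implementation: the step size, backoff factor, limits, and the rule that loss occurs whenever the rate exceeds a fixed capacity are all assumptions for illustration.

```python
def adapt_rate(rate, saw_loss, step=10.0, backoff=0.5, floor=10.0, ceiling=10_000.0):
    """One round of additive increase, multiplicative decrease."""
    if saw_loss:
        # Drop-off: cut the limit sharply, but never below the floor.
        return max(floor, rate * backoff)
    # Build-up: raise the limit slowly, but never above the ceiling.
    return min(ceiling, rate + step)


def simulate(capacity, rounds, start=20.0):
    """Toy link: 'loss' happens whenever the allowed rate exceeds `capacity`."""
    rate, history = start, []
    for _ in range(rounds):
        saw_loss = rate > capacity
        history.append(rate)
        rate = adapt_rate(rate, saw_loss)
    return history
```

    Calling simulate(capacity=100, rounds=50) produces a sawtooth that climbs until it overshoots the capacity and then halves. A real study would replace the fixed capacity with measured or simulated packet loss on the outgoing side of an asymmetric link.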

Let us know if you've made progress on any of these!
