HTTP-Over-Protocol - Story and lessons

During my time in Germany, a friend once told me that his university’s proxy only lets him use HTTP(S) traffic. I didn’t know much about networks back then to be able to think over this, but when I began my course in Networks, revisiting this problem suggested a solution.

After a few weeks, and around 6 hours of initial code later, I had the project ready, in the form of hop.

HOP is a tool meant to tunnel any sort of traffic over a standard HTTP channel.

Useful for scenarios where there’s a proxy filtering all traffic except standard HTTP(S) traffic. Unlike other tools which either require you to be behind a proxy which let’s you pass arbitrary traffic (possibly after an initial CONNECT request), or tools which work only for SSH, this imposes no such restrictions.

Response

I posted a link to hop on reddit for comments, and surprisingly, I received (github) stars at a never-seen-before rate for me. Some people suggested some existing ways to accomplish this. Personally, I knew there would be some. But while coding this, I had stopped myself from checking for such things, so as to not lower my motivation for finishing this code. This actually helped, and despite some generic methods existing already, hop specializes for arbitrary protocols, and I think it is useful in some ways.

The repository stayed in top 5 trending C++ repositories for 2 days, and went on to cross 100 stars after 3 days. This surely felt good :smile:

Working

This section tries to tackle a situation where you want to use SSH to connect to a remote machine where you don’t have root privileges.

There will be 7 entities:

  1. Client (Your computer, behind the proxy)
  2. Proxy (Evil)
  3. Target Server (The remote machine you want to SSH to, from Client)
  4. Client HOP process
  5. Target HOP process
  6. Client SSH process
  7. Target SSH process

If there was no proxy, the communication would be something like:

Client -> Client SSH process -> Target Server -> Target SSH process

In this scenario, here’s the proposed method:

Client ->
Client SSH process ->
Client HOP process ->
Proxy ->
Target HOP process ->
Target SSH process ->
Target Server

HOP simply wraps all the data in HTTP packets, and buffers them accordingly.

Another even more complicated scenario would be if you have an external utility server, and need to access another server’s resources from behind a proxy. In this case, hop will still run on your external server, but instead of forwarding to localhost from the target HOP process, you should forward to the remote address (hostname) of the target machine which has the SSH process.

Current state

After the initial release (when I started receiving stars on Github), I found a bug which caused data drops, and thus broke the tunneled SSH connections. Around 10 hours later, and multiple rewrites later (which made the code much more readable too), the bug finally stands fixed. The bug was due to me using 2 different threads to read and write to the same 2 sockets. I realized this while writing an article on bidirectional nature of sockets.

There had been plenty of memory corruption errors as well, which have been fixed now. I can use hop to SSH to machines over HTTP, or transfer files using nc reliably now.

Lessons

  • It is often quite hard to figure out parts which can be abstracted out in a program on the first rewrite. I actually leart a lot during the rewrites of the code.
  • You get no prizes for reusing variables in real life. Here I quote a very helpful user on Reddit. This actually helped me clean up my code a lot, and make it much more readable.
  • You can never assume thread safety, even for local buffers, or for logging classes.
  • Stress testing ought to be mandatory before publishing the project :smile:
  • Spend your time working on innovative projects rather than writing config files for packages etc. Helps you learn a lot more.
  • Avoid C/C++ if you can :smile: If you cannot, atleast avoid memory corruption errors.
  • Think before you begin writing. Think of all the corner cases, and all the errors you want to handle.
  • Don’t leave things which you assume will never happen, and thus you don’t handle them. If there’s something like that, put an assert on it. Will save a ton of a time later.
  • Remember, sockets are not synced nicely across threads.