dholm.com
I need this baby in a month send me nine women!

03 Dec 07 High availability network solutions

A while back I was tasked with prototyping a system for transferring large amounts of data across the Internet to a wide array of nodes without making any assumptions about how they were connected. This took some research on my part, as I hadn’t really designed anything network-wise that had to hold up under extreme load or service a huge number of simultaneous connections. During the investigative period I came across a couple of particularly well-written links, which I would now like to share with you.

The first one is “High-Performance Server Architecture” by Jeff Darcy. It is a good introduction to the subject and mainly covers how to manage resources. It will help you avoid the most common mistakes.

After that we have “The C10K problem” by Dan Kegel. This article digs a little deeper and offers many recommendations on how to handle ten thousand or more simultaneous clients by leveraging existing facilities present in many of the major *NIX operating systems. This is a typical don’t-reinvent-the-wheel scenario: the OS already has several solutions canned and ready for you, as long as you know where to look.
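
To give a feel for what “canned and ready” means in practice, here is a minimal sketch of a single-threaded echo server that lets the OS report which sockets are ready instead of dedicating a thread to each client. It uses Python’s standard `selectors` module, which picks epoll, kqueue or similar depending on the platform. The names `start_echo_server` and `pump` are made up for this illustration, and this is not the design of the system discussed in this post.

```python
import selectors
import socket


def start_echo_server(sel, host="127.0.0.1", port=0):
    """Register a non-blocking listening socket with the selector and return it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))  # port 0 asks the OS for any free port
    listener.listen()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, data="listener")
    return listener


def pump(sel, timeout=0.1):
    """Handle one batch of ready sockets; return how many events were seen."""
    events = sel.select(timeout)
    for key, _mask in events:
        sock = key.fileobj
        if key.data == "listener":
            # A new client is waiting: accept it and watch it for reads.
            conn, _addr = sock.accept()
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ, data="client")
        else:
            try:
                chunk = sock.recv(4096)
            except ConnectionError:
                chunk = b""
            if chunk:
                # Good enough for a sketch; a real server would buffer
                # writes and wait for EVENT_WRITE when the socket is full.
                sock.sendall(chunk)
            else:
                # Empty read means the peer closed the connection.
                sel.unregister(sock)
                sock.close()
    return len(events)
```

A real server would call `pump` in a loop. The point is that one thread can service many connections because it only touches sockets the kernel has flagged as ready.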

Finally I consulted CiteSeer and found a couple of really good articles at a somewhat more scientific level, which handed me the last pieces of the puzzle. As I can’t divulge too much about our system in particular, I’m going to leave the more specific articles out of this blog post.

To top it all off, I want to share this excellent but unrelated link to “Capturing that Special Moment”.


Reader's Comments

  1.

    What is “large amounts of data” here? There are some interesting proprietary solutions to handle data distribution to many nodes at once. Tibco’s Rendezvous system springs to mind.

    In Rz a consumer process subscribes to a named channel and receives messages on this channel. A sender doesn’t need to know how many consumers there are. The infrastructure can guarantee delivery, if you want, and can use techniques such as multicast, if available.

    Unfortunately, Rz costs a lot of money. It would be interesting to see a free software alternative, but I’m not aware of any off the top of my head.

    Anyway, if you have control of the network (or can put some pressure on your provider) you might want to consider using multicast and some sort of reliable protocol (NAK based, perhaps?) on top of it.

  2.

    We are talking about hundreds of gigabytes which have to be transferred as quickly as possible. Multicast is not an option, or at least not one we can assume will be available.
    The prototype was a NAK-based system running over UDP with some nifty features added.
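
    For readers unfamiliar with the term: in a NAK-based scheme the receiver stays silent while packets arrive in order and only reports the sequence numbers it is missing. The sketch below shows just that receiver-side bookkeeping, with no sockets involved. It assumes packets are numbered from 0, and the class and method names are hypothetical. It is not the prototype's actual protocol, which is not described here.

```python
class NakReceiver:
    """Tracks sequence numbers and reports gaps that should be NAKed."""

    def __init__(self):
        self.next_expected = 0  # lowest sequence number not yet delivered
        self.pending = {}       # out-of-order packets: seq -> payload
        self.delivered = []     # payloads handed on, in order

    def on_packet(self, seq, payload):
        """Record a packet; return the sorted list of missing sequence numbers."""
        if seq >= self.next_expected:
            # Ignore duplicates of packets that are already buffered.
            self.pending.setdefault(seq, payload)
        # Deliver everything that is now contiguous.
        while self.next_expected in self.pending:
            self.delivered.append(self.pending.pop(self.next_expected))
            self.next_expected += 1
        if not self.pending:
            return []
        highest = max(self.pending)
        # Anything between the delivery point and the highest packet seen
        # that has not arrived yet is a gap the sender should be told about.
        return [s for s in range(self.next_expected, highest)
                if s not in self.pending]


receiver = NakReceiver()
receiver.on_packet(0, b"a")           # in order, nothing to report
missing = receiver.on_packet(3, b"d")  # packets 1 and 2 have not arrived
```

    In a real transfer the returned list would be sent back to the sender as a NAK, typically rate-limited so a burst of loss does not trigger a burst of NAKs.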


