System Design and Memory Limits

How to Approach:

Don’t be scared by these types of questions. Unless you claim to know how to design large systems, your interviewer probably won’t expect you to know this stuff automatically. They just want to see how you tackle these problems.

General Approach

The general approach is as follows: Imagine we’re designing a hypothetical system X for millions of items (users, files, megabytes, etc):

  1. How would you solve it for a small number of items? Develop an algorithm for this case, which is often pretty straight-forward.

  2. What happens when you try to implement that algorithm with millions of items? It’s likely that you have run out of space on the computer. So, divide up the files across many computers.
    » How do you divide up data across many machines? That is, do the first 100 items appear on the same computer? Or all items with the same hash value mod 100?
    » About how many computers will you need? To estimate this, ask how big each item is and take a guess at how much space a typical computer has.

  3. Now, fix the problems that occur when you are using many computers. Make sure to answer the following questions:
    » How does one machine know which machine it should access to look up data?
    » Can data get out of sync across computers? How do you handle that?
    » How can you minimize expensive reads across computers?

Example: Design a Web Crawler

  1. Forget about the fact that you’re dealing with billions of pages. How would you design this system if it were just a small number of pages? You should have an understanding of how you would solve the simple, small case in order to understand how you would solve the bigger case.

  2. Now, think about the issues that occur with billions of pages. Most likely you can’t fit the data on one machine. How will you divide it up? How will you figure out which computer has a particular piece of data?

  3. You now have different pieces of data on different machines. What problems might that create? Can you try to solve them?

And remember, don’t get scared! This is just an ordinary problem-solving question.