{"id":47,"date":"2017-09-21T00:52:30","date_gmt":"2017-09-21T00:52:30","guid":{"rendered":"http:\/\/blogs.harvard.edu\/profsmith\/?p=47"},"modified":"2017-09-21T00:52:30","modified_gmt":"2017-09-21T00:52:30","slug":"scattered-or-great-design-thinking","status":"publish","type":"post","link":"https:\/\/archive.blogs.harvard.edu\/profsmith\/2017\/09\/21\/scattered-or-great-design-thinking\/","title":{"rendered":"Scattered, or Great Design Thinking?"},"content":{"rendered":"<p>Jim and I talked about a range of design topics and a bit about how today\u2019s Internet works in the seminar this week, and I thought I\u2019d write my blog in the same \u201cscattered\u201d fashion. While at first seemingly scattered and long, if you bear with me, we might get to the second half of this blog\u2019s title. Ready?<\/p>\n<p>I\u2019ll start with a pointer to a <a href=\"https:\/\/www.wired.com\/story\/what-is-dns-hijacking\/\">Wired article about DNS hijacking<\/a>. On Monday, we discussed, at a high level, how the Domain Name System (DNS) on the Internet helps people to get to web sites (or other resources on the Internet) by converting easy to memorize domain names (e.g., college.harvard.edu) to the numerical IP addresses (i.e., 184.73.172.23 in my example). The IP address is simply a standardized name space that works like U.S. postal addresses. For example, the address of <a href=\"http:\/\/www.pinocchiospizza.net\/\">Pinocchio\u2019s Pizza &amp; Subs<\/a> shop is 74 Winthrop St., Cambridge MA. You\u2019d find it by traveling to Massachusetts and then to the Massachusetts\u2019s town of Cambridge. In Cambridge, you\u2019d look for Winthrop Street and hope you\u2019d find a building with the number 74 on it. IP addresses work the same way. The first numbers take you to a particular part of the network, and the last number corresponds to some particular machine (house) on the identified subnet (street). The Wired article talks about how adversaries have attacked this system to cut certain resources out of the Internet for a short time.<\/p>\n<p>Unfortunately, adversaries are tougher to defeat than Mother Nature. So let\u2019s stick with getting around the problems that Mother Nature throws at us for the rest of this blog post.<\/p>\n<p>While Jim and I were repeatedly expressing our deep admiration for the <a href=\"http:\/\/dl.acm.org\/citation.cfm?id=357402\">Saltzer et al paper<\/a> this week, we somehow stumbled into a discussion of bit-error detection and correction. I\u2019m worried that I glossed over things too quickly. So, here\u2019s a bit more explanation of how this stuff works. There\u2019s a whole science behind this, but I\u2019ll attempt to just give you a flavor of how it works and why it is powerful.<\/p>\n<p>As we discussed, you can detect a single bit error by adding a parity bit to any group of bits. For example, if I wanted to protect a 4-bit data string against single bit errors while the packet was in transmission, I would add a single bit to the data string (creating a 5-bit packet), and the value of this fifth bit would be the sum of the other four bits modulo 2. As described, I\u2019ve added what\u2019s called an even parity bit to my packet. It\u2019s called this because the addition of the parity bit makes the number of ones in the 5-bit packet even (or zero). Lots of things in the digital world have duals, and the same is true here for we could also create an odd parity bit that would make the number of ones in the packet odd. Let\u2019s stick with even parity.<\/p>\n<p>How does this work in practice? The machine sending the original 4-bit data string computes the even parity bit, appends it to the end of the four bits of data, and sends a 5-bit packet. The receiving machine grabs the 5-bit packet and computes the packet\u2019s parity (over all 5 bits). If the result of this computation is zero, the first 4 bits are the data it wanted without any single-bit errors. If the result is one, then there was an error in transmission and the packet needs to be resent.<\/p>\n<p>Notice that we can\u2019t correct the error without retransmission because we don\u2019t know where the error occurred in the 5 bits we received. One of the 1s was flipped to a 0 or a 0 was flipped to a 1, but all we know is that a bit was flipped somewhere.<\/p>\n<p>Of course, this scheme isn\u2019t guaranteed to work for 2 bit errors. If you flip any two bits, the effect of their flips cancels each other out in the parity calculation. I\u2019ll let you figure out what happens when you have more than 2 bit errors.<\/p>\n<p>We, however, were interested in fixing (Mother-Nature-type) errors in the network so that you could create a reliable network and remove the need for retransmissions. To explore this question, let\u2019s assume that your network never has anything more than a single bit error in a packet no matter what its size (an unrealistic, but pedagogically interesting situation). What would I have to do to guarantee that the receiving machine had enough information to correct any single bit error in the packet?<\/p>\n<p>Well, you could send the same data string twice in a single packet, which guarantees that you would get a clean copy of the data bits. Unfortunately, if the two received copies don\u2019t match, you don\u2019t know which copy of the duplicated data bits is the one you want. You simply made a very space-expensive, single-bit error detector that not only tells you that an error occurred, but where it occurred in the string of data bits.<\/p>\n<p>Ok, that design failed, but fear not. We can always try again. Let\u2019s try sending three copies of the data bits in a single packet. Now in the case of a single bit error in the packet, we can use a simple voting mechanism to decide what is the correct string of data bits. Success! The cost of this success was that we need a packet three-times longer than the data we wanted to send. Hmmm, maybe we can do better.<\/p>\n<p>This is where the work of Richard Hamming comes in. Hamming did pioneering work in the middle of the last century on error detection and how to build efficient error correcting codes. Continuing with our example of wanting to send 4 data bits over a network, Hamming showed that you could add just 3 parity bits to the 4 data bits (creating a 7-bit packet) and correct any single bit error that occurs in the entire 7-bit packet. That\u2019s less space overhead than we added in our first duplication scheme that failed to achieve error correction! More impressive, the overhead of Hamming\u2019s approach decreases very quickly as you send longer and longer strings of data bits. For example, when we send 4 data bits in a 7-bit packet, the efficiency of the transmission is (4\/7 =) 0.571. If we wanted to send 247 data bits, we\u2019d need only 8 parity bits to cover all (247+8 =) 256 bits in the packet for a (247\/256 =) 0.969 efficiency! Nearly every bit sent is a useful data bit! That\u2019s the power of exponentials and clever encodings.<\/p>\n<p>To see exactly how this works, I encourage you to look at the example on the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Hamming(7,4)\">Hamming(7,4) Wikipedia page<\/a>. The matrix arithmetic might look intimidating, but it\u2019s just a mathematical way of saying that the three parity bits looked at as a 3-bit number can cover all 7 locations in our 7-bit packet, allowing us through clever encoding to know exactly which bit was flipped, if one was. If no bit was flipped (i.e., no single bit error occurred), the eighth encoding of the three parity bits means no single bit error occurred.<\/p>\n<p>In class, I said we needed to add only three parity bits to correct any single bit error in eight data bits. Off by one error! You clearly need four parity bits to correct a single bit error in eight data bits (a 12-bit packet). Please fix your notes if you took some.<\/p>\n<p>I want to bring us back to design as I end this blog post. I often cold call on some of you during our seminar asking, for example, \u201cYasmin, what would you do to make this work?\u201d In asking this, I\u2019m asking a design question, and design questions are, at first, hard. Why? Because design questions have no right answer. (Please go back and read the previous line again.) And because they have no right answer, you can answer any way you like! The worst thing that will happen is that you\u2019ll describe something that doesn\u2019t do what we hoped it would do, as I did above in the doubling-the-data-string example. No problem with that. Maybe we\u2019ll design something cool for a different application, or we\u2019ll learn something and restart our design process a bit smarter.<\/p>\n<p>Now, most of you, probably, have never done any significant design. That\u2019s ok. More importantly, in my humble opinion, the only way to learn to do design is to do it. How might you start doing it? (How might you answer my question to Yasmin?) Well, you know where you want to go and so just take a small step in that direction. Suggest something that sounds like it moves us in the desired direction and maybe give the reason why you think it moves us in the right direction. We did that above. With enough practice, you\u2019ll see patterns in design that lead to fundamental advances like Hamming did, and like Saltzer and his colleagues did.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Jim and I talked about a range of design topics and a bit about how today\u2019s Internet works in the seminar this week, and I thought I\u2019d write my blog in the same \u201cscattered\u201d fashion. While at first seemingly scattered and long, if you bear with me, we might get to the second half of [&hellip;]<\/p>\n","protected":false},"author":8112,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-47","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/archive.blogs.harvard.edu\/profsmith\/wp-json\/wp\/v2\/posts\/47","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/archive.blogs.harvard.edu\/profsmith\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/archive.blogs.harvard.edu\/profsmith\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/profsmith\/wp-json\/wp\/v2\/users\/8112"}],"replies":[{"embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/profsmith\/wp-json\/wp\/v2\/comments?post=47"}],"version-history":[{"count":1,"href":"https:\/\/archive.blogs.harvard.edu\/profsmith\/wp-json\/wp\/v2\/posts\/47\/revisions"}],"predecessor-version":[{"id":48,"href":"https:\/\/archive.blogs.harvard.edu\/profsmith\/wp-json\/wp\/v2\/posts\/47\/revisions\/48"}],"wp:attachment":[{"href":"https:\/\/archive.blogs.harvard.edu\/profsmith\/wp-json\/wp\/v2\/media?parent=47"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/profsmith\/wp-json\/wp\/v2\/categories?post=47"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/archive.blogs.harvard.edu\/profsmith\/wp-json\/wp\/v2\/tags?post=47"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}