MIT Web Programming Crash Course

Table of Contents

1 Introduction

These are the (work in progress) notes I took while taking MIT's 2019 web.lab, formerly called 6.148 Web Programming Competition. It is an 8 day workshop that assumes no background, and the goal is to produce a prototype. I use a library abstraction in OCaml to build web applications successfully, but I don't understand the underlying fundamentals of web development, such as CSS beyond basic flexbox manipulation, so I decided to take this workshop as a crash course. My learning style is to identify things I don't understand as I progress and try secondary sources to elaborate on whatever seems confusing or is hand waved in the slides and lectures. This may be helpful to you, and it is an example of the kind of notes you should take yourself when trying to figure out a new subject. We can't possibly learn the entire field of web development in 8 days so this is just a starting point.

1.1 Course materials

  • The 2019 schedule: http://weblab.mit.edu/schedule/ (flip your device to landscape, or view in desktop mode if you can't see lecture/slide links)
    • The 2019 lectures have terrible production quality, but slides are available to pair with them. Or try the 2018 schedule, which has better quality lectures
  • The git repository for the assignments

Optional:

  • MIT's 6.170 Software Studio lectures and slides for exercises and review, which is a more advanced course so optional extra material.
  • The free book, Functional-Light JavaScript by Kyle Simpson to help explain some of the 6.170 material.

1.2 Prerequisites

As stated in the intro this workshop assumes no background, not even a programming background. You could absolutely follow the lectures and just hack together a prototype and it will work.

1.2.1 If you want to learn programming

Try this book, or even better, the whole course for it, all you need is this browser interpreter. Once you get to a level of semi-competence start rewriting some of the basic library functions like member, length, map or fold. Compare your implementation to the official one. Repeat until your implementation starts to converge with the way the language developers have written the implementation. You may even want to start contributing and solving some of the issues or writing minor features. Do enough of these until you can write bigger features. Congrats you are now a developer.

1.3 Required software

I'm doing the majority of exercises with Eruda, on a $100 Motorola phone 1hr or so per day and it's been working so far. You can also use Chrome DevTools or Firefox Developer Tools to edit JavaScript in the console as you read the slides on any desktop browser, or use VSCode. This site itself is Termux + Emacs + org-mode and the generated html is then POSTed to 1mb.site, a free host. 1mb also has an online code editor. For the Node workshops you can run this locally or deploy for free to Zeit.

1.4 Discord channel

Some anons have started a Discord channel if you want to collaborate/learn with others.

2 Ideation

  • YouTube - IAP 2019 Day 1 - Ideation

Recommendation: just flip through the slides.

2.1 Step 1: generate anything and everything

  • business model abstractions, improving existing models
  • 'PAFT': (P)roblem thinking, considering the (A)udience, considering (F)eatures, but have no idea what they mean by Theme

2.2 Step 2: closing doors and narrowing windows

  • Overdone ideas are talked about, though it directly conflicts with the very first step which talks about successful abstractions based on existing ideas

2.3 Step 3: transforming ideas

  • MVP = Minimum Viable Product or the bare minimum functionality you need to showcase your idea
  • Try to describe your website/app in a single sentence
    • talking about diverging your idea such as extending bathroom codes list w/wifi codes
    • example features: paying 'points' for contributing to a crowdsourced data set to encourage user participation

2.4 Editors/IDEs

VSCode is shilled, which is pretty easy to learn. For most of the beginning of the workshop I edited in Eruda (use Chrome DevTools on a desktop) directly changing html and CSS.

2.5 Summary

The features that define your software should be in the prototype you push early to your audience, with nice to have features left as a TODO. You can make this easier by being able to describe your software in one sentence, which will help make it clear what your critical features are.

3 HTML

Html:

  • is 'nested boxes'
  • DOCTYPE reason for existence is hand waved.
    • Investigation reveals doctype declarations prevent the browser from loading pages in other modes such as 'quirks mode'
      • quirks mode turns out to be a legacy feature to intentionally simulate the bugs of old browsers like IE4/5.
    • Important: anything written before a doctype declaration may trigger nonstandard behavior
    • There are now 3 modes: quirks mode, almost standards mode, and a full standards mode.
      • We will certainly want 'full standards mode' so <!DOCTYPE html>

Html tags:

  • Opening and closing tags to structure the document into containers
  • Can't violate scope, ie: close a container tag then try and close a tag that was opened inside that now closed container
    • If I open a box, and inside are envelopes that I open, I can't then close the box and then expect to close the internal envelopes from outside the box
  • Reasons for why tags are even used in the first place are in this primer:
    • The browser throws all newline indicators away and processes the html as a long string as it deems appropriate
    • Exception: tag <pre> that indicates preformatted so will display as is
    • Elements represent logical relationships rather than precise physical relationships, because people can alter their browser settings to display the page in a different physical style than what you specify
      • Ex: bold, underline, italics, may all be displayed differently but the html element for these things simply represents that text should be more prominent in some way
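
The box-and-envelope rule above can be sketched as a stack check. This is only a minimal JavaScript sketch, not how a browser parses HTML (real parsers recover from bad nesting instead of rejecting it). The function name isWellNested and the short void-element list are my own, made up for illustration.

```javascript
// Hypothetical helper: every closing tag must match the most recently
// opened tag, like closing the envelopes before closing the box.
const VOID_ELEMENTS = new Set(['img', 'br', 'hr', 'input', 'meta', 'link']);

function isWellNested(html) {
  const open = []; // stack of currently open tag names
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g;
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const isClosing = match[1] === '/';
    const name = match[2].toLowerCase();
    if (VOID_ELEMENTS.has(name)) continue; // void elements never hold content
    if (!isClosing) {
      open.push(name);
    } else if (open.pop() !== name) {
      return false; // closed something that was not the innermost open tag
    }
  }
  return open.length === 0; // everything that was opened got closed
}

console.log(isWellNested('<div><p>hi</p></div>')); // true
console.log(isWellNested('<div><p>hi</div></p>')); // false
```

The second call fails because the div is closed while the p inside it is still open.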

Clarity: What exactly is a tag vs an element? An element is what the tag represents (semantics) while the tag is syntax. For example the tag <div>, it is markup syntax for 'the Content Division element'. If I were to make a new element 'the Supreme Container element' and give it a tag <kjdfj34> </kjdfj34> it's now clear which is the element and what is markup syntax.

Html head:

  • General machine processed information about the page (metadata)
    • Slides have additional information if you look at the bottom of each one, reveals the head primarily holding information for search engines and the browser regarding title, scripts and css style sheets

Html body:

  • Represents what you see (the content)
  • Heading tags, h1 = 'first order' (top level), h2 is subsection title, h3 etc.

Html Attributes:

  • Optional attributes are inside the opening tag
  • Link tags are called 'anchor', prob for obscure netscape navigator reasons
  • ul, ol and li (list) elements discussion, dissenting opinion here why ul shouldn't be considered 'unordered'

Not covered, void elements: The reasons given in the lecture why <img> doesn't have a closing tag were vague. The HTML standard reveals this is a void element, which is not allowed any content, hence there is no reason to require a closing tag (void elements can still have attributes).

3.1 Avoiding <div> soup

Div soup is when you look at the source at many modern websites and see an enormous pile of div containers everywhere.

  • HTML now has semantic elements, some advice on semantic element use from the HTML working group WHATWG
    • These elements exist so software like password managers, search engine bots (SEO), Pocket, and accessibility assistants can parse the page, and for future developers trying to maintain the page (which could be yourself a few months from now)
    • div is defined in the lecture as a block section ie: 'making scope boxes in documents'
    • span is defined as inline sections, such as changing the colors of every letter in a single word

3.2 Let's read some of the HTML standard

MIT's advice is to 'just google', so I'm going to skim through the actual HTML standard, to clear up these questions like which semantic elements do what, what exactly div is, etc.

  • The standard insists there is only HTML and no 'HTML5', as the standard is unversioned.
  • A review of semantics
    • Another statement that html markup conveys the meaning, rather than the presentation since the browser can adjust all these things.
    • There's a comment about using grouping elements like hgroup to organize secondary titles when using headings, so they aren't misinterpreted as a separate section.
      • This is apparently not adopted by W3C, another standards organization, and MDN claims hgroup is 'theoretical only' linking to a W3C workaround using span

We find what we need:

  • Full table of contents for HTML elements and their description
    • A div element is strongly encouraged to only be used as last resort, with other semantic tags used instead for accessibility software
  • The <em> element is emphasis, for accessibility readers and other machines parsing html, it subsumes italics element
  • Span is exampled here as loading a translation attribute in a sentence on a single word
  • Standard also lists various attributes we will want to ensure the [0-9] number keypad pops up on touchscreens.
    • "Not all inputs that contain numbers (credit cards, SS#, etc) should have an input type of number. They are best served up as text inputs, which are still free to use the pattern attribute to pull up the tel keypad on supported devices"
    • In general the number input should only be used if it's something you're going to do math operations on, you are unlikely to add and subtract account numbers and other numeric identifiers.

I just skimmed inputs as there will be future lectures on this.

3.3 WHATWG standard or W3C aka w3.org?

Which html standard do we use? W3C forks the WHATWG standard, with inconsistent updates. A summary of why there are two different standards bodies (WHATWG and W3C) is here. Note that w3schools.com is an unaffiliated tutorial site, not W3C. Unfortunately we can't rely on the WHATWG standard either, as it's a 'living document' which translates to an unversioned document.

Further research indicates the canonical resource to use in order to find out what works with which browser, should be MDN, with https://caniuse.com/ and https://devdocs.io/ as secondary resources. If you search for 'most popular screen readers' you find a list of products that primarily use IE unfortunately.

4 CSS Workshop

  • YouTube - IAP 2019 Day 1 - CSS & HTML CSS Workshop

We are asked to clone this repository https://github.com/mit6148-workshops/html-css-workshop or download the .zip and expand.

We inspect these example html pages in Chrome devtools (right click, choose inspect or Ctrl-Shift-i depending on your OS) or w/Eruda.

  • Child elements of div receive the styling of the parent element
  • Overrides are talked about, the most specific naming has the highest priority
    • The lecturer is surprised no questions about overrides, and I've marked this as a TODO because they seem important to understand.
  • Padding is discussed, if you play around with the values editing them while in inspector, you'll get it immediately
  • We are asked to make a class called text-center in style2.css. Then we are asked to add the class to the div with id=profile-container. I solve this by first looking at style2.css and noticing .card was commented as a class selector so I mimicked that entry that has a dot in front of the selector, with my own new property: .text-center { text-align: center }.

Then I opened profile2.html and inserted <div id="profile-container" class="text-center"> to see if it would work, assuming because this is markup we can add a space and fill in new attributes. It works. I try it on <h4></h4> by pasting in my class attribute: <h4 class="text-center"></h4> and it works there too. Finally I see the card class entry: <h3 class="card"> and add a space, inserting my text-center class: <h3 class="card text-center"> and it works as expected.

4.1 Circle avatar CSS hack

  • profile4.html and profile4.css are an example to display a background image with a circle border.
    • this seems like a work around/hack and something we would never do in practice

Let's attempt to understand what is going on. With Chrome inspector open while displaying profile4.html, click on the div for large-profile-container: we see there is green padding on each side of the image and an empty spot exactly where the image is shown. Now examine the div for circle-avatar: the green padding covers exactly where the image is. According to this version of the CSS standard, color or images can be attached to the padding with the background property, which is exactly what is specified in the circle-avatar selector. The border-radius property is obvious, just play with its values to see how it curves the border.

The CSS for large-profile-container contains the line: "padding: 0 40%" which from MDN indicates: "When two values are specified, the first padding applies to the top and bottom, the second to the left and right." Examining the div large-profile-container again in Chrome inspector, on the bottom right where there is the box model diagram, we scroll down and see the padding-left and padding-right values are identical. What the percentage value has done is take the width of the parent div container, profile-container, and multiply it by 40% to obtain those two values. Examining the div for circle-avatar (we're still in Chrome inspector btw), the padding-top value is exactly the same as the content width of its parent, large-profile-container (and, since the avatar is the only thing inside it, the parent's height too), because the CSS for the circle-avatar div contains padding-top: 100%. Percentage padding, even top and bottom padding, is calculated from the width of the containing block, so 100% means multiply the parent's content width by 1. That makes the padded area a square, which border-radius can then round off. Note the browser calculates this automatically, and these values will change as you resize your page.
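
To check my understanding of the percentages I worked the arithmetic through in JavaScript. This is a sketch under assumptions: the 1000px width for profile-container is made up, I assume large-profile-container is a plain block that fills its parent with no border or margin, and resolvePadding is my own helper name. The one rule it relies on is that percentage padding on any side resolves against the width of the containing block.

```javascript
// Percentage padding on any side resolves against the WIDTH of the
// containing block (the parent's content box).
function resolvePadding(percent, containingBlockWidth) {
  return (percent * containingBlockWidth) / 100;
}

// Assumed example width for profile-container; the real value depends on
// the browser window.
const profileContainerWidth = 1000;

// large-profile-container has padding: 0 40%
const sidePadding = resolvePadding(40, profileContainerWidth);
const largeContentWidth = profileContainerWidth - 2 * sidePadding;

// circle-avatar has padding-top: 100%
const avatarPaddingTop = resolvePadding(100, largeContentWidth);

console.log(sidePadding, largeContentWidth, avatarPaddingTop); // 400 200 200
```

With these numbers the avatar ends up 200px wide with 200px of top padding, a square, and resizing the window changes every value in step.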

Another use for the background property is assigning a temporary color if you're scaffolding a bunch of containers by hand, just to see the layout.

4.2 Debugging CSS

The example in the lecture of adding a text-center class to everything can also be used as a debug helper, adding a brightly colored outline to html containers to see the container layout more clearly. As per MDN, outlines (as compared to border) do not move other components on the page.

4.3 How the cascade works

The convention on how overrides are handled was only briefly covered, so let's find out what it is and what exactly is cascading. First we look at MDN, which reveals cascading is the process in which the winning declaration is chosen, since we need to consider user stylesheets (apparently you can write your own CSS, I've just been enforcing reader mode everywhere), the browser's default stylesheets, and the CSS from the author of the website or document. These can all conflict so the cascade determines which styling should 'win'. MDN doesn't really elaborate on the decision making process and points out the !important rule that all 3 (author, user, user-agent(browser)) can use to try and override all other style sheets, and that the user defined style sheet has precedence. It's recommended as a CSS author to never use !important.
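
Here is how I picture the origin and !important part of the cascade, as a rough JavaScript sketch. The precedence order is the one MDN documents; the declaration objects and the function names are my own invention, and this ignores selector specificity entirely, which only breaks ties within the same origin.

```javascript
// Origin + importance, from lowest to highest precedence.
const PRECEDENCE = [
  'user-agent', 'user', 'author',
  'author!important', 'user!important', 'user-agent!important',
];

function rank(decl) {
  return PRECEDENCE.indexOf(decl.origin + (decl.important ? '!important' : ''));
}

// Hypothetical declaration shape: { origin, important, value }.
// On a tie the later declaration wins, mirroring source order.
function winningDeclaration(declarations) {
  return declarations.reduce((best, decl) =>
    (rank(decl) >= rank(best) ? decl : best));
}

const competing = [
  { origin: 'user-agent', important: false, value: 'black' },
  { origin: 'author', important: false, value: 'navy' },
  { origin: 'user', important: true, value: 'yellow' },
];
console.log(winningDeclaration(competing).value); // yellow
```

Without the user's !important the author's navy would win, which matches the note that author styles beat user styles by default.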

Checking google scholar we see there's a CSS thesis by HW Lie. Survey papers and theses are better than any online tutorial since they will include design decisions.

  • Let's quickly skim through the thesis
  • The chapter specific to CSS is chapter 6 so we start there
  • A few pages about syntax in detail, and a full listing of properties for CSS2
  • A list of possible values for properties, such as '3em' which is a number with a unit. 'em' and 'ex' are relative to font size: much like our percentage example, which was relative to the parent container, these sizes are expressed relative to whatever font size the user has selected for their browser/reading assistant.
    • these number/unit values can also express angles '90deg', time '10ms' (ie: how long to pause) and freq '3kHz' on the aural properties, which are used by screen readers/speech synthesizers.
    • functions can be values, apparently some properties accept a functional notation when using a string presents a naming conflict.
    • numerous other values are exampled, there's an interesting formal definition for what a pixel is considered to be in CSS
    • space or comma separated multiple values are fine

An interesting statement about properties and their settings: "Nearly every CSS property has different rules for the values on its right-hand side and it is not much of an exaggeration to say that each property's right-hand side has its own specialized language"

This leads into inconsistent percentage values which we already ran into with the CSS-avatar hack: "Percentage values are similar to relative length units in the sense that they are relative to another value. Each property that accepts a percentage value also defines to what other value the percentage is relative. Most percentage values refer to width of the closest block-level ancestor but some refer to other values, for example, the font size of the parent element."

  • Value propagation aka Cascading begins in Chapter 6.5.1
    • 3 principal mechanisms for propagation: cascading, inheritance and initial values
      • the cascade is a 'negotiation' between 3 parties (author, user, user-agent(browser)) to ensure all element/property combinations have a value
      • if the cascading process doesn't yield a value, the parent element's value will be used (inheritance), and failing that the property's initial value
    • By default author declarations (the person who wrote the CSS) win over user settings unless the user marks !important in their style declarations
  • Interesting note: the thesis author considered a scenario where users could all share style sheets for different sites on a peer-to-peer basis, too bad this didn't become a mainstream thing where I can use a browser extension and flip through dozens of competing designs for various sites.
  • The section on creating boxes from elements describes the parsing process, making a tree, etc. It also suggests implementations can optimize processing by doing several steps in parallel
    • Apparently Stylo in Firefox's Quantum browser parallelizes their CSS rendering implementation. TODO: look at this implementation sometime
    • Design decisions are described for the '3 band' box model in CSS of Margin, Border and Padding.
      • note that negative values can be used for margin, to overlap boxes

The rest of the thesis is put on hold since we'll learn positioning in the upcoming lecture 'Advanced CSS'.

4.4 Proprietary extensions

Browsers and other software have proprietary CSS extensions. These are defined as "A CSS feature is a proprietary extension if it is meant for use in a closed environment accessible only to a single vendor’s user agent(s)".

4.5 The current CSS standard

W3C maintains the official CSS standard as snapshots so we're left with the same problem as HTML where we have to rely on software documentation (if it exists), or MDN and https://caniuse.com to figure out which software supports what CSS features.

5 CSS Frameworks

  • YouTube - IAP 2019 Day 1 - Intro to Bootstrap
  • 'Responsive' is defined as basically being able to render properly on multiple screens/devices.
  • Bootstrap is a CSS framework to make being responsive easier
  • Other frameworks are covered, now we know the reason why all sites look the same
  • Bootstrap and these other frameworks essentially abstract Flexbox

5.1 CSS Grid

CSS grid is 'two dimensional' whereas Flexbox layout is 'one dimensional'. The standard gives this example of the difference, but it seems like CSS grid can reference all of row/column at the same time (adjust height, grow the row past the current row into another) and overlap elements, where Flexbox can only reference a single row/column at a time.

  • Pitfalls with CSS Grid: the legacy mobile browsers still out there on old devices, and IE versions 10 and 11, which refuse to die.
    • There's a workaround for IE using Autoprefixer CSS online, but you could also use JavaScript to check the browser version and present a more barebones page for IE until it finally goes away.

Let's start with the canonical resource for these things (that will tell us what is implemented) which is MDN.

  • We begin to read MDN web docs basic concepts of grid layout
    • all the examples can be done in one html file, resize your browser window when you check the changes to see how grid is responsive to your window resizing:
<!DOCTYPE html>
<style>
.wrapper {
    display: grid;
    grid-template-columns: 1fr 1fr 1fr;
}
</style>
<div class="wrapper">
    <div>One</div>
    <div>Two</div>
    <div>Three</div>
    <div>Four</div>
    <div>Five</div>
</div>
  • So far this is all pretty straightforward: the value 'fr' represents a proportion of whatever space is left inside a container, so 1fr and 2fr side by side represent 1/3 and 2/3 of the avail space.
    • 2fr and 1fr is translated as: "this is twice as big as the other"
  • minmax() function can also be input to repeat() as a parameter
    • repeat() parameters are repeat(num-of-repeats, the-values-to-repeat)
    • repeat(4, 1fr) will create: 1fr 1fr 1fr 1fr
    • repeat(2, 10px 1fr) will create: 10px 1fr 10px 1fr
    • repeat(2, minmax(100px, auto)) will create 2 tracks sized of min value 100px, and max value set to auto.
  • The rest of the MDN page on basic concepts of grid layout is self explanatory: it covers overlapping boxes and using the z-index property to declare their display priority over other boxes, and refers to other guides with layout examples we can look at when we actually start building something.
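
To convince myself how fr and repeat() behave, here is the arithmetic as a JavaScript sketch. The function names are mine, and this is a deliberate simplification: it only understands 'px' and 'fr' tracks and ignores gaps, minmax() and content sizing, which the real track sizing algorithm has to handle.

```javascript
// Expand repeat(count, tracks) into a flat track list.
function repeat(count, tracks) {
  return Array.from({ length: count }, () => tracks).flat();
}

// Resolve a list of 'Npx' and 'Nfr' tracks to pixel widths: fixed tracks
// are subtracted first, then the leftover space is split by fr share.
function resolveTracks(tracks, containerWidth) {
  const sumOf = (unit) => tracks
    .filter((t) => t.endsWith(unit))
    .reduce((sum, t) => sum + parseFloat(t), 0);
  const fixed = sumOf('px');
  const totalFr = sumOf('fr');
  const perFr = totalFr > 0 ? (containerWidth - fixed) / totalFr : 0;
  return tracks.map((t) =>
    (t.endsWith('fr') ? parseFloat(t) * perFr : parseFloat(t)));
}

console.log(repeat(2, ['10px', '1fr']));
// [ '10px', '1fr', '10px', '1fr' ]
console.log(resolveTracks(['1fr', '2fr'], 900));
// [ 300, 600 ]
console.log(resolveTracks(repeat(2, ['10px', '1fr']), 420));
// [ 10, 200, 10, 200 ]
```

In the last call the two 10px tracks are taken out first, and the remaining 400px is split evenly between the two 1fr tracks.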

5.1.1 CSS Grid spec

Let's skim through the CSS Grid Layout Module Level 1 candidate recommendation. The 2018 Level 2 module W3C published is just an addon to define subgrids and is still a working draft, so we'll read it after. An example of subgrids is given here.

  • We begin to read the candidate recommendation for CSS Grid
  • Mentioned are media queries: conditions you can add to your stylesheet to test for certain properties of the user's display, like aspect ratio, and then apply specific styles for their device.
    • An example of media queries: if orientation is landscape present the following grid, if portrait, present this other grid configuration.
  • Motivation for grid was to replace the hacky workarounds of tables and floating elements for application layouts.
  • First example using a game screen is self explanatory, and actually a better explanation than the MDN docs
  • Grid layout is precisely defined here describing 'block axis' (column) and 'inline axis' (row).
  • There's a warning about "correct source ordering" which is defined by MDN, basically keeping the HTML semantic source logical and only using CSS grid to reorder the content visually, so a screen reader and other software can still interpret your html.
  • Important we can give names to columns and rows
    • Another article about naming, to make the CSS more flexible; the same article also recommends using proportions (the fr value) instead of specific values/percentages

The rest of the CSS grid spec goes into fine details about positioning like 'gutters' the term for space between grid boxes. We'll come back to these when we do the advanced CSS lecture.

5.1.2 Common layouts by example

MDN has a guide to common layouts using CSS Grid, that use semantic html elements. Seems like there are a dozen different ways to write CSS grid stylesheets, like using wrappers (injecting some class named .wrapper into a div element, then using .wrapper as a selector to build a grid). We should probably look at some real world examples and not tutorials to see how this is done.

5.1.3 Examining a sample page

Let's try to figure out a page's CSS. I have chosen Stripe's website, specifically the payments marketing page because there's all kinds of things going on like skewed grids, and animations.

  • A list of interesting things:
    • Smartphone screenshot image labelled 'kickstarter' has been transformed/rotated to adjust perspective
      • transform: translate(-50%,-50%) rotate3d(.5,.866,0,16deg) rotateZ(-2deg);
        • transformations are described here, to understand this you need a few lectures from this course or wait until the Advanced CSS lecture.
      • the '$25' overlayed on the phone image is a span element that has also been transformed: scale(var(--device-scale)) translate(54px,620px)
      • --device-scale is a CSS variable
    • Navigation up top is also using some kind of transformations, to slow/smooth the display of the next menu option.
      • translateX() seems to accomplish this, instead of disappearing and producing a new window, it shifts the old window horizontally with some kind of delay for smoothness/effect
    • Animations used for the 'recent updates' boxes, and shadowing for effects
      • These cards seem to be stacked boxes overlaying each other, with shadow effects on their borders, animated using the translateY() function, so like the navigation menus that shifted right, these are shifting up with a transition-delay property
    • The images for 'Grow Faster' and 'Developer-Centric' are interactive
      • It's using an event listener, which we haven't learned yet
      • Again we see there are translateX() function transformation w/delays, in incremental negative values, to give it a smooth transformation from right to left.
    • Background looks like an affine grid plane, which is equally spaced parallel lines in two different directions, with some parts colored to make stripes
      • see below

Let's mimic the stripes. Inspecting any part of the page in Firefox or Chrome Developer Tools, click on the div with class "lower-page" and then its child div with class "grid" to see the grid, on the right hand side under layout. Clicking on each grid reveals its construction, and the slanted grid lines that paint the stripes look interesting. I try: transform: skewY(-0.06turn) (which I found in MDN documentation). Then I try to color the d d d area background, using shorthand grid-area values: selecting grid row 2, column 1, spanning 'only grid 2' (unsure about grid-row-end), and going from column line 1 (far left of grid) over to column line 4 to create a similar effect, a background colored stripe on the page. Changing these values helps to understand them, except 'grid-row-end', which doesn't act like I think it should when I change its value from 1 to 3. Looking at this I now get what it does. We can also change the value to span 2 or span 3 to create a thicker stripe.

<!DOCTYPE html>
<!-- creating background stripe grid test -->
<style>
.wrapper {
    display: grid;
    grid-template-columns: repeat(12, 1fr);
    grid-auto-rows: minmax(100px, auto);
    /* the named areas only cover the first 9 of the 12 columns */
    grid-template-areas:
        "h h h h h h h h h"
        "d d d m m m m m m"
        "ft ft ft ft ft ft ft ft ft";
    transform: skewY(-0.06turn);
}
.box1 {
    /* row-start / column-start / row-end / column-end */
    grid-area: 2 / 1 / 2 / 4;
    background-color: lightblue;
}
</style>

<div class="wrapper">
    <div class="box1"></div>
</div>
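
To get a feel for what skewY(-0.06turn) does to the stripe, here is the arithmetic as a JavaScript sketch. The helper names are my own. It assumes the default transform-origin (the center of the element), so x is the horizontal distance from the center, and it uses the skew rule that a point is shifted vertically by x * tan(angle).

```javascript
// One turn is a full circle, so 0.06turn is a little under 22 degrees.
function turnToDegrees(turns) {
  return turns * 360;
}

// skewY(angle) moves a point at horizontal offset x vertically by
// x * tan(angle); negative results mean "up" in screen coordinates.
function skewYShift(x, turns) {
  return x * Math.tan(turns * 2 * Math.PI);
}

console.log(turnToDegrees(-0.06));    // about -21.6
console.log(skewYShift(1000, -0.06)); // roughly -396
console.log(skewYShift(-1000, -0.06)); // roughly 396
```

So on a wide page the right edge of the stripe rises by a few hundred pixels and the left edge drops by the same amount, which is the slanted look we were after.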

6 CSS/HTML Exercises and review from 6.170

See these slides from MIT's 6.170 for a review of what we've just learned. There's a one-to-one correspondence between HTML elements and DOM nodes where HTML attributes are stored as properties of DOM nodes.

  • Clear examples of the HTML tree
  • Clear examples of CSS selectors
    • * is the universal selector, selects all elements
    • #id selects the element with that id attribute ie: #header selects <div id="header">
    • .class as we already know selects elements by their class attribute ie: .photo <div class="photo">
    • some notes on pseudo-classes to select interactive states like :hover, :checked, :focus, etc.
  • Some notes on CSS combinations, which lead into exercises below
  • Cascade specific 'score values'
    • +1000 if styles specified inline
    • +100 for every #id
    • +10 for every .class, [attr], :pseudo
    • +1 for every element
  • Box Model is reviewed
    • absolute values: px, pt, cm are exact measurements that don't respect user browser settings
    • relative values: em, %, vw are calculated which we already knew
  • Flexbox review
    • display: flex;
    • .item {flex-grow: 1;} makes every .item grow evenly to fill the avail space
    • #selector {flex-grow: 2;} makes that one element take twice the share of the free space
      • flex-shrink: 1; will evenly distribute negative space aka shrink
    • Short hand is covered, which we already know from Grid, flex: value value value is documented at MDN for grow, shrink, basis etc.
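
The cascade 'score values' above are easy to try out in code. This is a rough JavaScript sketch of the slide's scoring, with a function name I made up. It only copes with simple selectors, and the base-10 score is itself just a rule of thumb (in a real browser no number of classes outranks a single #id).

```javascript
// Score a simple selector with the weights from the slides:
// inline 1000, #id 100, .class / [attr] / :pseudo 10, element 1.
function specificityScore(selector, inline = false) {
  let rest = selector;
  // count the matches of a pattern, then blank them out of the selector
  const take = (pattern) => {
    const found = (rest.match(pattern) || []).length;
    rest = rest.replace(pattern, ' ');
    return found;
  };
  const attrs = take(/\[[^\]]*\]/g);
  const ids = take(/#[\w-]+/g);
  const classes = take(/\.[\w-]+/g);
  const pseudos = take(/:[\w-]+/g);
  const elements = take(/[a-zA-Z][\w-]*/g);
  return (inline ? 1000 : 0)
    + ids * 100
    + (classes + attrs + pseudos) * 10
    + elements;
}

console.log(specificityScore('div'));                             // 1
console.log(specificityScore('h3.card:hover'));                   // 21
console.log(specificityScore('#profile-container .text-center')); // 110
```

This is why a rule written against #profile-container beats one written against .text-center alone: 110 versus 10.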

6.1 CSS Selector exercises

Let's do some selector CSS exercises from: https://flukeout.github.io

  • I get to level 8 and then get stuck on the small oranges in the bentos selections which I figure out are 'bento orange.small'
  • I get to level 16 then get stuck on the :only-child selector, it seems like it should be plate:only-child
    • Answer turns out to be plate *:only-child
  • Stuck on level 30, however typing in plate[for^="Sa"] selects the plate, so we know what to do: plate[for^="Sa"], bento[for^="Sa"]

6.2 Flexbox exercises

  • These recitation slides have exercises if you wish to practice flex, such as FlexBoxFroggy.
  • If you're completely stuck on the last exercise:
    • flex-direction: column-reverse;
    • flex-wrap: wrap-reverse;
    • justify-content: center;
    • align-content: space-between;

7 Git and the command line

  • YouTube - IAP 2019 Day 2 - Command Line & Git

Begins with a standard command line introduction; there are some other slides here from Cornell with more content on package management, command flags, etc. Surprisingly Gitless, a simplified user interface for Git, isn't covered. Git is best understood if you know the basics of what a DAG is (a directed acyclic graph: commits point to their parents and the links never form a cycle); there's also this tutorial by example: A Hacker's Guide to Git. Not discussed are GitLab and Bitbucket, other repository hosting services with various features, or Mercurial and the numerous other version control systems. There are some UIs for git, probably the best of them is magit, and there are also text interfaces like Tig. Note emacs has a GUI, so you can access pull down menus for all of magit's features, which will also show you the keyboard shortcuts.
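
Since the DAG idea is what makes git click, here is a toy version in JavaScript. This is only a sketch: the commit ids and messages are invented, and real git stores far more per commit, but the shape is the same, with every commit pointing back at its parent(s).

```javascript
// Hypothetical commits; a merge commit simply has two parents.
const commits = {
  a1: { message: 'initial commit', parents: [] },
  b2: { message: 'add profile page', parents: ['a1'] },
  c3: { message: 'style profile (feature branch)', parents: ['b2'] },
  d4: { message: 'fix typo (main branch)', parents: ['b2'] },
  e5: { message: 'merge feature branch', parents: ['d4', 'c3'] },
};

// Walk parent links to collect everything reachable from a commit,
// roughly the set of commits its history contains.
function history(commitId) {
  const seen = new Set();
  const pending = [commitId];
  while (pending.length > 0) {
    const id = pending.pop();
    if (seen.has(id)) continue; // reached via another path already
    seen.add(id);
    pending.push(...commits[id].parents);
  }
  return [...seen];
}

console.log(history('c3').length); // 3
console.log(history('e5').length); // 5
```

The feature branch commit c3 knows nothing about d4, but the merge commit e5 can reach both sides, and because links only ever point backwards the walk always terminates.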

8 JavaScript

  • YouTube - IAP 2019 Day 2 - JavaScript
  • let statements are used instead of var: let is block scoped while var is scoped to the entire function
    • const declares a variable that cannot be reassigned (a constant), though the contents of a const object or array can still change
  • Some talk about arrays in JS, they look like lists you can index
  • A JS object is a collection of key-value pairs so basically a struct in other languages
  • Equality is three ='s (===), whereas two ='s (==) coerces both items to be the same type before comparing
    • A table of equality boolean outcomes here
    • this tests value vs value, or memory reference to see if objects point to same reference
  • A JS callback function is somewhat defined.
    • an example of map is used, JS has anonymous/lambda functions
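
A quick JavaScript sketch to see the scoping and equality rules in action (the names are mine, nothing here is from the lecture code):

```javascript
function scopeDemo() {
  if (true) {
    var functionScoped = 'var'; // visible throughout scopeDemo
    let blockScoped = 'let';    // visible only inside this block
  }
  // typeof is used so the missing name doesn't throw a ReferenceError
  return [typeof functionScoped, typeof blockScoped];
}
console.log(scopeDemo()); // [ 'string', 'undefined' ]

console.log(1 == '1');  // true: == coerces the string to a number first
console.log(1 === '1'); // false: === also compares the types

const list = [1];
console.log([1] === [1]);   // false: two different arrays in memory
console.log(list === list); // true: the same reference
```

The last two lines are the 'value vs reference' point: primitives compare by value, objects and arrays compare by whether they are the same object.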

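Clarity: JavaScript callback functions: a callback is a function passed as an argument to another function, which then calls it at some point, either during its own execution (as map does for each element) or later when an event or asynchronous operation completes. This style resembles continuation-passing, although JavaScript does not have first-class continuations.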
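
A small JavaScript sketch of callbacks, both the immediate kind used by map and the 'call me later' kind. afterDelay is a helper I made up for illustration; map and setTimeout are standard.

```javascript
// map calls the callback once per element, right away.
const doubled = [1, 2, 3].map(function (n) {
  return n * 2;
});
console.log(doubled); // [ 2, 4, 6 ]

// The same idea with an anonymous arrow (lambda) function.
const squared = [1, 2, 3].map((n) => n * n);
console.log(squared); // [ 1, 4, 9 ]

// Hypothetical helper that takes a callback and calls it later.
function afterDelay(ms, callback) {
  setTimeout(() => callback('done after ' + ms + 'ms'), ms);
}
afterDelay(10, (message) => console.log(message));
```

In both cases the function being passed in doesn't run itself; the function receiving it decides when to call it.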

Manipulating HTML:

  • document.getElementById('identifier');
    • gets HTML elements, and seems to return them as some kind of object/structure where you can access various fields using .notation, such as .innerHTML
  • document.createElement('div');
    • creates an element, which can then be filled in and appended to another element (for example with appendChild)
  • button.addEventListener('click', function() { alert("Hi"); }); adds an event listener/action.
  • Adding scripts to HTML documents:
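    • <script src="source.js"></script> goes at the end of the html document so the elements already exist when the script runs (the defer attribute is another way to get this)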
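
Putting the three calls together, here is a JavaScript sketch. addGreetingButton is my own made-up function, and 'profile-container' is just the id from the earlier CSS workshop reused as an example. I pass the document in as a parameter so the same function can be exercised outside a browser.

```javascript
// Hypothetical helper: adds a button to the element with the given id;
// clicking it appends a <p>Hi</p> to that element.
function addGreetingButton(doc, containerId) {
  const container = doc.getElementById(containerId);
  if (container === null) return null; // no element with that id

  const button = doc.createElement('button');
  button.textContent = 'Say hi';
  button.addEventListener('click', function () {
    const message = doc.createElement('p');
    message.textContent = 'Hi';
    container.appendChild(message);
  });
  container.appendChild(button);
  return button;
}

// In a browser, wait until the page has been parsed before looking up ids.
if (typeof document !== 'undefined') {
  document.addEventListener('DOMContentLoaded', function () {
    addGreetingButton(document, 'profile-container');
  });
}
```

Waiting for DOMContentLoaded is one way around the 'script at the end of the document' requirement, since the lookup only happens once the elements exist.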

9 Workshop 0: JavaScript

  • YouTube - IAP 2019 Day 2 - JavaScript Workshop

The weblab.to link doesn't work; instead clone catbook-workshop0. If you're lost in these exercises, look at the answers in the slides and play around with the values/inputs, or maybe add another feature, since hardly any JS has been covered yet.

10 JavaScript Exercises and basics review from 6.170

Let's go through the slides for JS basics

  • Official(standardized) name for JS is 'ECMAScript'
  • Numbers are 64-bit floats (no ints)
  • Booleans, null, undefined, and strings (there is no separate char type)
  • arrays a = [];
    • grow and shrink dynamically, use length property for size: a.length;
    • array methods for various manipulation like a.push("test"); or a.pop(); a.indexOf("hello");
    • a = ["hello", 2, null, [1, 2], "there"];
    • a[number] array notation to access that value
  • Objects a = {hello: "there"};
    • access properties with dot notation a.hello
    • can change these properties a.hello = "hi";
    • values can be nested b = {qty: 3, item: {name: "crayon", price: 5}};
  • Interpreter determines types of values at runtime with implicit type coercion
  • Scope:
    • by default (assigning without any keyword) variables are defined in the global scope; with the let or const keyword they have block scope, ie: between { and }
    • const variables cannot be reassigned, but their properties can be changed
      • const STUFF = {name: "test"};
      • STUFF.name = "test2";
    • var keyword has entire function scope: through hoisting, var declarations (not their assignments) are moved to the top of the function body, so they cover every block within that function
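The scope rules above, as a minimal sketch you can paste into the console. The function names scopeDemo and hoistingDemo are made up for this example; STUFF is the constant from the list.

```javascript
function scopeDemo() {
  if (true) {
    var functionScoped = "var"; // visible in the whole function
    let blockScoped = "let";    // visible only inside these braces
  }
  // typeof on a name that is not in scope gives "undefined" instead of throwing
  return [typeof functionScoped, typeof blockScoped];
}

function hoistingDemo() {
  const before = hoisted; // undefined, not an error: the declaration was hoisted
  var hoisted = 1;        // the assignment itself stays on this line
  return [before, hoisted];
}

const STUFF = { name: "test" };
STUFF.name = "test2"; // allowed: a property changes, the binding does not

let reassignError = null;
try {
  STUFF = { name: "other" }; // not allowed: reassigning a const binding
} catch (err) {
  reassignError = err;
}

console.log(scopeDemo());                        // ["string", "undefined"]
console.log(hoistingDemo());                     // [undefined, 1]
console.log(STUFF.name);                         // "test2"
console.log(reassignError instanceof TypeError); // true
```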

JS functions: This chapter helps to explain things like arbitrary numbers of arguments. Multi-argument functions in most languages are really single-argument functions that take a tuple as an argument.

// declared with a name, arguments(parameters) and a body
function multiply(a, b) {
   return a * b;
}

//can take a variable number of arguments that don't have to be specified, access these with array index notation
function multiply() {
   return arguments[0] * arguments[1];
}


//arguments can have default values
function multiply(a, b = 1) {
   return a * b;
}

//functions are values, can be assigned to variables
let multiply = function(a,b) {
   return a * b;
}

//can be passed as arguments, ie: first class functions
let toys = [
 {name: "Woody", price: 10},
 {name: "Rex", price: 3}];

toys =
   toys.filter(function(toy) {
      return toy.price < 9;});

10.1 Functionals:

See this chapter on functionals explaining how map/reduce/filter work.

  • "functionals", aka higher-order functions, are functions that take another function as input; map, filter and reduce return a new array or value, leaving the old one unchanged
    • list.filter(function(item) { … });
      • filter function that tests each element of the array and constructs a new array from whatever values pass the filter
    • map also available in js: list.map(function(item) { … }) which applies a function to every element, returning a new array
    • list.reduce also available for a single output value, using an accumulator: list.reduce(function, init);

Clarity: JS arrays look and act like lists, but are just high level objects, not actually linked lists (or even real arrays for that matter).
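A short sketch of the three functionals side by side, reusing the toys array from earlier. The third toy ("Slinky") is added here only so the results are less trivial.

```javascript
const toys = [
  { name: "Woody", price: 10 },
  { name: "Rex", price: 3 },
  { name: "Slinky", price: 6 }, // hypothetical extra entry
];

// filter: keep the elements for which the function returns true
const cheapToys = toys.filter(toy => toy.price < 9);

// map: apply the function to every element, collecting the results
const names = toys.map(toy => toy.name);

// reduce: fold the array into one value, starting the accumulator at 0
const total = toys.reduce((acc, toy) => acc + toy.price, 0);

console.log(cheapToys.map(toy => toy.name)); // ["Rex", "Slinky"]
console.log(names);                          // ["Woody", "Rex", "Slinky"]
console.log(total);                          // 19
console.log(toys.length);                    // 3, the original array is untouched
```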

10.2 Functionals exercises

These start on slide #58. The recitation slides have optional exercises to practice functionals/closures.

  • 1. Create a double function. In the console I type:
    • function double(x) { return x * 2 };
    • a = [2, 3, 5, 7];
    • a.map(double)
  • Let's try it with a lambda function/anonymous function literal
    • a = [2, 3, 5, 7];
    • a.map(x => x * 2);
  • 2. Sum an array of values
    • So we want reduce
      • function getSum(acc, num) {return acc + num;}
      • a = [2, 3, 5, 7];
      • a.reduce(getSum, 0); (zero is the starting value of the accumulator)
      • Also works: a.reduce((acc, num) => acc + num, 0);
        • With arrow functions () => x is short for () => { return x; }.

10.3 Closures

This chapter on closures vs objects will clarify these slides.

There's an example on slide #62 of a pairs(list1, list2, f) function being passed another function as an argument in order to encapsulate the iteration and provide the ability to reference items instead of direct array indices, to avoid mistakes. Closures are then covered, using a higher-order function, here a function that returns another function. If you're reading the slides in, say, the Chrome or Firefox PDF reader, right click and choose inspect element or inspector, click on console and enter the code you see to understand what it's doing. Type 'hello;' into the console to see its construction: f(msg){ console.log(prefix + msg);} which is how hello("arvind"); works. There's an explanation of closures here but for clarity let's refer to PSML:

..the concept of a closure, a technique for implementing higher-order functions. When a function expression is evaluated, a copy of the environment is attached to the function. Subsequently, all free variables of the function (i.e., those variables not occurring as parameters) are resolved with respect to the environment attached to the function; the function is therefore said to be “closed” with respect to the attached environment. This is achieved at function application time by “swapping” the attached environment of the function for the environment active at the point of the call. The swapped environment is restored after the call is complete.
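To make the "attached environment" concrete, here is a minimal sketch in the spirit of the hello example. The names makeGreeter and makeCounter are assumptions, and unlike the slide version these functions return the string instead of logging it.

```javascript
// makeGreeter returns a new function; prefix is a free variable of that inner
// function, so it is looked up in the environment attached when it was created.
function makeGreeter(prefix) {
  return function (msg) {
    return prefix + msg;
  };
}

const hello = makeGreeter("hello, ");
const bye = makeGreeter("goodbye, ");
console.log(hello("arvind")); // "hello, arvind"
console.log(bye("arvind"));   // "goodbye, arvind"

// Each call creates a fresh environment, so closures can hold private state.
function makeCounter() {
  let count = 0;
  return function () {
    count += 1;
    return count;
  };
}

const counterA = makeCounter();
const counterB = makeCounter();
counterA();
console.log(counterA()); // 2
console.log(counterB()); // 1, counterB has its own count
```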

There's a memoization exercise in the slides. This is usually covered in an algorithms class (or the PAPL book). What it's doing is caching previous computations in a data structure. When you use it for Fibonacci as in the example, memoizedFib(40), the memoized function calls fib(40) and stores the result in the array we declared. The next time you call memoizedFib(40) it will look up the entry for index 40 in the array and produce that previous value instead of doing exponential work calling fib(40) all over again. Note that it is a generic memoization function: anything else that maps one input to one output (like calculating factors, or primes) can use the same memoization function, such as newfactor = memoized(factor) or fastprime = memoized(sieve).
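A sketch of what such a generic memoizer might look like. This is not the code from the slides: it uses a Map rather than an array as the cache, the name memoize is an assumption, and the calls counter exists only to show that the second lookup does no work. fib(20) is used instead of fib(40) so it runs instantly.

```javascript
// Wrap any one-argument function so repeated calls with the same input
// return the cached result instead of recomputing it.
function memoize(f) {
  const cache = new Map();
  return function (n) {
    if (!cache.has(n)) {
      cache.set(n, f(n)); // first time: do the real work and remember it
    }
    return cache.get(n);  // afterwards: a plain lookup
  };
}

let calls = 0; // counts how often the slow function actually runs
function fib(n) {
  calls += 1;
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

const memoizedFib = memoize(fib);
const firstResult = memoizedFib(20);
const callsAfterFirst = calls;
const secondResult = memoizedFib(20); // served from the cache
console.log(firstResult, secondResult);  // 6765 6765
console.log(calls === callsAfterFirst);  // true: fib was not run again
```

Note that only the top-level call is cached here; the recursive calls inside fib still go to the unmemoized function.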

Mixins are briefly shown, which won't make any sense unless you've taken an OOP class or PAPL. You may not need the 'this' keyword.

11 JavaScript: DOM Manipulation exercises and review from 6.170

The first part of these slides covers OOP constructs such as prototypes, which are explained here. I don't think either course has fully described what the document object model (DOM) is, so we'll look at the standard for it later.

  • Slide #37 we come to DOM manipulation
    • We can select a DOM node w/selectors (except pseudo-classes like caption:hover):
      • document.querySelector("#header"), document.querySelectorAll(".caption")
      • Note these are CSS selectors, so everything we learned in CSS about selectors can be applied here
    • An element is a DOM node that corresponds to an html tag like <p> or <h1>
    • A node is a generic DOM node class for text, comments, etc.
      • element.tagName, element.id, element.innerHTML, node.nodeType, node.nodeValue
    • Manipulating properties: element.style.color = "blue";, element.style.borderRadius = "10px";
    • Caveats of building content with innerHTML string concatenation: poor performance, since the browser has to parse the string and then reconstruct the DOM subtree
      • cannot insert an element between two others, can't return references to new elements
  • The slides walk through an example of adding a new link to the tree after h1:
    • let link = document.createElement("a"); (create a new element)
    • link.href = "http://eecs.mit.edu"; (set the element's href property)
    • let label = document.createTextNode("EECS"); (create a new text node to act as the link's label)
    • link.appendChild(label); (append the label to 'a')
    • let body = document.querySelector("body"); (grab the body DOM tree object, assign it to a variable for manipulation)
    • body.appendChild(link); (append our new 'a' node to body)
  • Inserting before a node:
    • let newChild = document.createElement("p"); (create a new element again called p)
    • let parent = document.querySelector("body"); (create a variable called parent that represents the html body)
    • let sibling = document.querySelector("h1"); (create a sibling variable, select the element 'h1')
    • parent.insertBefore(newChild, sibling); (insert our new element p before h1)
    • parent.replaceChild(newChild, sibling); (or replace h1 with the new p)
    • parent.removeChild(newChild); (or remove a child; removeChild takes only the node to remove)
  • There are multiple methods to traverse the tree: .previousSibling, .nextSibling, .nextElementSibling, etc.

11.1 DOM manipulation exercise

We're asked to go to example.com, open the console, write three ways to select the "More information" link, one of which must be a traversal, and inside the white box add a third paragraph of text. Finally add a 2nd level heading before the paragraph we added. They have different solutions in the slides, notably the ('p > a') selector and using .children to walk the tree.

  • Select <a> directly:
    • let link = document.querySelector("a");
    • link.href
  • Traverse to <a>:
    • let parent = document.querySelector('p'); (grabs the first paragraph above the link)
    • let traverse = parent.nextElementSibling;
    • traverse.firstElementChild.href
  • Another way to select the href:
    • let b = document.querySelectorAll('p');
    • this selects every 'p' and places into a NodeList, which we can access using array notation []:
    • b[1].firstElementChild.href
    • b[1].innerHTML also reveals the link
  • Let's add a paragraph:
    • let box = document.querySelector('div'); (grabs the white box that holds both paragraphs)
    • let newChild = document.createElement('p'); (create a new <p>)
    • box.appendChild(newChild); (appended to the box rather than to a <p>, since paragraphs can't validly nest)
    • newChild.innerText = "Here's another paragraph"
  • Let's add the second level heading:
    • let newHeading = document.createElement('h2');
    • box.insertBefore(newHeading, newChild);
    • newHeading.innerText = "This is a new Heading";

Let's finish the slides from 6.170, starting on slide #77 for JavaScript events.

  • JS is single threaded
  • blocking is a problem for long running operations
  • JS has a traditional call stack and a message queue for events such as clicks, timers, etc.
  • when the call stack is empty, the event loop evals FIFO (first in, first out)

Running their example shows how events (timeouts) are deferred:

function one() {
 console.log("one");
 setTimeout(two, 0);
 console.log("three");
 setTimeout(four, 0);
 console.log("five");
}
function two() {
 console.log("two");
}
function four() {
 console.log("four");
}
one(); // logs one, three, five, and then two, four once the call stack is empty
  • setTimeout(function() {…}, ms);
    • Executes the given function after a minimum delay in milliseconds
  • let timerID = setInterval(function() {…}, ms);
    • Executes the given function at a given interval in milliseconds
    • clearInterval(timerID) to stop the timer
  • Listening for interaction events:
    • let elem = document.querySelector("h1");
    • elem.addEventListener("click", function(event) {…});
  • Browser creates events + adds them to the queue when h1 is clicked
  • When the event is processed from the queue, the registered listener function is called.

11.2 Event exercises

We're asked to change the font color of the h1 heading every time we click it, and set a timer to change the background color of the 'Example Domain' box.

// returns a random color as string "rgb(x,x,x)"
function randomRGB() {
    let rgbArray = [256,256,256];
    // Math.floor gives whole channel values from 0 to 255
    let rgb = rgbArray.map( x => Math.floor(x * Math.random()) );
    return 'rgb(' +rgb[0] +',' +rgb[1] +',' +rgb[2] +')';
}

// change h1 to a random color on every click
let h1 = document.querySelector('h1');
h1.addEventListener( 'click', function randomColor() { 
    h1.style.color = randomRGB();
});

// timer to change Example Domain div background color randomly every second
let exampleBox = document.querySelector('div');
let timerExampleBox = setInterval( function randomBackground() { exampleBox.style.background = randomRGB();}, 1000);
  • Event Handling Gotchas: Context
    • ES6 'arrow functions' are needed for the listener to see this.counter from the enclosing function; a regular function used as an event listener gets its own context, where 'this' is the element that fired the event
function Counter() {
   this.counter = 0;
   let b = document.createElement("button");
   b.innerText = "Increment me!";
   document.body.append(b);
   // the arrow function keeps the 'this' of Counter
   b.addEventListener("click", () => { this.counter += 1; console.log(this.counter); });
 }
new Counter();

12 Intro to Asynchronous Programming

  • YouTube - IAP 2019 Day 3 - Intro to Backend

Some basic information here on HTTP requests and response, how async code runs vs synchronous execution.

12.1 HTTP Request & Response from 6.170

Let's review the slides for web servers.

  • A clear presentation of query strings, path, protocol, host and URL fragments
    • simple webserver is presented
    • let http = require('http'); is how you import modules in Node.js (won't work in the browser console without hacks)
  • Intro to cookies/sessions

…and that's it. The rest of that lecture is a NodeJS project which we'll do later.
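For reference, the URL pieces from those slides can be pulled apart with the built-in URL class, available in browsers and in Node. The address below is made up for illustration.

```javascript
// Hypothetical URL containing every part the slides name
const url = new URL("https://example.com:8080/teas/green?userID=1000&name=Bob#reviews");

console.log(url.protocol);                   // "https:"
console.log(url.host);                       // "example.com:8080"
console.log(url.pathname);                   // "/teas/green"
console.log(url.search);                     // "?userID=1000&name=Bob"
console.log(url.searchParams.get("userID")); // "1000"
console.log(url.hash);                       // "#reviews"
```

The fragment (everything after #) is handled by the browser and is not sent to the server with the request.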

13 Workshop 1: Client Side Javascript + Async

  • We're working with git clone https://github.com/mit6148-workshops/catbook-workshop1.git
  • Slide #8 questions: Let's walk through api.js since they only give a brief description.
    • First, see these slides starting on slide#6 about url parameters from a CMU data science course.
    • From the slides: You've seen URLs like these: google.com/url?sa=t&rct=j&source=web…
    • Everything after ? is parameters given to the Google API. The '&' character separates them.
    • Read the rest of the slides about REST APIs, JSON data

api.js explained:

function get(endpoint, params, successCallback, failureCallback)
  These parameters are:
   - endpoint = the base URL ie: https://google.com
   - params = whatever you want to send to the API "userID=1000"
   - successCallback, failureCallback = optional functions that you define/pass as args 

 const xhr = new XMLHttpRequest();
   We're making a constant variable xhr, and assigning to it a new object XMLHttpRequest().
   You can try this in the console: const temp = new XMLHttpRequest() then type temp; to look at the object.
   MDN has a full list of all its 'methods', such as XMLHttpRequest.getResponseHeader(); 

 const fullPath = endpoint + '?' + formatParams(params);
   Creates constant variable fullPath which will be baseURL + ? + parameters
   Example: fullPath = https://google.com?userID=1000
   Note, another function is called, formatParams(params) which encodes the parameters using encodeURIComponent()
    - This percent-escapes reserved characters (such as & or spaces) using their UTF-8 bytes

  xhr.open('GET', fullPath, true);
   Explained here: https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/open
   The boolean is an optional parameter that defaults to true because you don't want synchronous requests
    - Synchronous meaning your script has to wait for a reply, possibly freezing the user's browser
  
  xhr.onload = function(err) {
    if (xhr.readyState === 4) {
      if (xhr.status === 200) {
        if (successCallback)
          successCallback(JSON.parse(xhr.responseText));
      } else {
        if (failureCallback)
          failureCallback(xhr.statusText);
      }
    }
  };
   
    Here is where our optional functions are. If the request returns HTTP status 200 (OK) then 
    the if-branch tests whether you actually gave a success function as an argument: if (successCallback) determines if that parameter exists.
    If you did, then it calls your function with the argument JSON.parse(xhr.responseText); if you didn't, then nothing is run. 

    In the workshop the successCallback function is: function(stories) { console.log(stories) };
    xhr.responseText is first parsed by JSON.parse, and the resulting object is passed to your function as stories,
    so the console is populated with the parsed stories rather than the raw response text.
      
    Try this yourself in the console, giving no success/failure function: 
     const API_ENDPOINT_START = 'http://google-catbook.herokuapp.com';
     let atest = get(API_ENDPOINT_START + "/api/stories", {});
     atest; (undefined)
    
    This is what the slides mean by 'nothing is returned' by the get/post functions: the result is only available inside 
    the success and failure functions you pass as arguments. 

  xhr.send(null);
    This is explained here: https://stackoverflow.com/questions/15123839/why-do-we-pass-null-to-xmlhttprequest-send

Let's also look at their function formatParams():

 function formatParams(params) {
  return Object
    .keys(params)
    .map(function(key) {
      return key+'='+encodeURIComponent(params[key])
    })
    .join('&');
}
  First, they have broken up the return line for readability: 
   return Object.keys(params).map(function(key){ return key+'='+encodeURIComponent(params[key]) }).join('&');

  encodeURIComponent() you can look up on MDN, it's escaping characters. 
  Object.keys() you can also look up on MDN, it's returning an array of keys:
   
   let params2 = {}; //empty dict
   params2.userID = "100";
   params2.name = "Bob";
   console.log(Object.keys(params2));
    -> (2) ["userID", "name"]

  let test = Object.keys(params2);
  test[0];
   -> "userID"
  params2["userID"]
   -> "100"
  
 So the return value is built by taking each key, concatenating '=' and the encoded value of params[key], and joining the pieces with '&':
 formatParams(params2);
   -> "userID=100&name=Bob"
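As a point of comparison, browsers and Node also ship a built-in URLSearchParams class that does a similar job to formatParams. This is only a sketch of the alternative, not what the workshop's api.js uses, and the sample values are made up.

```javascript
const params = { userID: "100", name: "Bob" };

// Same output as formatParams for simple values
const query = new URLSearchParams(params).toString();
console.log(query); // "userID=100&name=Bob"

// What encodeURIComponent does to reserved characters
const escaped = encodeURIComponent("cats & dogs");
console.log(escaped); // "cats%20%26%20dogs"

// URLSearchParams escapes too, but writes spaces as + instead of %20
const plusEncoded = new URLSearchParams({ q: "cats & dogs" }).toString();
console.log(plusEncoded); // "q=cats+%26+dogs"
```

Either way the point is the same: a literal & inside a value would otherwise be read as a separator between parameters.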
  • More talk/slides about async vs synchronous behavior
    • clicking on Network while in Chrome Inspector lists the get reqs queue/history
    • Guaranteeing order: nested requests/callbacks
    • Design around callback hell of deeply nested get requests
  • First task is to add renderStories() to index.js, and then fill in the skeleton in feed.js for renderStories()
    • We can see that in index.js there's a main function, underneath where the navbar is rendered add: renderStories();
    • Then in feed.js complete function renderStories() with simple output to console
      • Refresh index.html and look at the console, the object filled with stories should be there.
      • you would have console.log(a = stories) if you wanted it bound to a variable to play around with

I tried a different version of renderStories() than the lectures:

function renderStories() {
    let newDiv = document.getElementById("stories");
       
    get(API_ENDPOINT_START + "/api/stories", {}, function(stories) {
        stories.map(story => newDiv.prepend(storyDOMObject(story)) )});
}

I also wrote mine so that comments would be pulled inside the storyDOMObject function, since the JSON id field needed to fetch the comments is in scope there. Of course in production you would never do this because of possible failures of the comment API, which we aren't even handling since we omit the failureCallback arguments. You likely wouldn't design api.js like they did either.

function storyDOMObject(storyJSON) {

    const storyDiv = document.createElement('section');
    storyDiv.setAttribute('id', storyJSON._id);
    storyDiv.className = 'story';

    const cardBody = document.createElement('article');
    cardBody.className = ('story-content');
    cardBody.innerText = storyJSON.content
    storyDiv.appendChild(cardBody);

    const creator = document.createElement('footer');
    creator.className = ('story-creator');
    creator.innerText = storyJSON.creator_name
    cardBody.appendChild(creator);

    get(API_ENDPOINT_START + "/api/comment", {parent: storyJSON._id}, function(comments) {
        comments.map( comment => creator.append(commentDOMObject(comment)) ) }); 

    return storyDiv;
}

You'll probably have to change the default CSS, adding some background color to the comments and stories helps distinguish them as the default CSS is terrible.

13.1 Web APIs from 6.170

Let's review the slides for web APIs.

  • REST: Representational State Transfer
  • The idea is URLs identify a representation of a resource
    • Path hierarchies to imply structure
    • GET is the only 'safe' call guaranteeing data will not be modified
    • APIs provide a contract, add versions to API URLs when changing them
      • There will be many more lectures on good API design
  • AJAX: Asynchronous JavaScript and XML (JSON these days)
  • XMLHttpRequest() object is covered again
    • xhr.onload = function() { let response = JSON.parse(xhr.responseText); }
    • response handler is registered before making the req as we already know
    • xhr.open("GET", "/teas/green", true); boolean is covered again, to make it asynchronous
  • Continuations (callbacks) are discussed:
    • get returns immediately, but the callback function isn't run until the response returns
    • The continuation is called at some arbitrary point in the future outside the normal call stack
    • Cannot capture the result with a normal return statement
    • Side effects: introduce additional external state (let greenTeas;), set the state within the continuation
      • complexity increases
      • callback hell again
  • Promises: a proxy object, returned synchronously, for a value that will be determined some time in the future
    • Also covered in Functional-Light JS with some better examples
    • Chains execute async operations one after another by returning a new promise within a Promise.then()
      • The idea is to start a new operation with the previous result

MDN has an article on Promises and their usage.

// Using Promises for XMLHttpRequest()

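function get(url) {
  return new Promise(function(resolve,reject){
    let xhr = new XMLHttpRequest();

    xhr.onload = function() {
      resolve(xhr.responseText);
    };
    // Reject on network failure so .catch() handlers can run
    xhr.onerror = function() {
      reject(new Error("Network error"));
    };
    // Make the req
    xhr.open("GET", url, true);
    xhr.send();
   });
}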

13.2 Exercises: Callback Hell & Promises

We're asked to load example.com again and, using both nested callbacks and chained promises:

  • Set the body background color after 1 second
  • Set the div background color 1 second after we set the body background
  • Set the p font color 1 second after we set the div background
// nested callback method
setTimeout( function() {
  document.body.style.background = 'green';
    setTimeout( function() {
      document.querySelector('div').style.background = 'yellow';
        setTimeout (function() {
          document.querySelector('p').style.color = 'blue'; }, 
        1000); }, 
    1000); }, 
1000);

// Promises method
let setColorPromise = function(selector, property, color) {
   return new Promise(function (resolve, reject) {
     setTimeout(function() {
      document.querySelector(selector).style[property] = color;
      resolve();
     }, 1000); 
    })
}

setColorPromise('body', 'background', 'green')
 .then(function () { return setColorPromise('div', 'background', 'yellow'); })
 .then(function () { return setColorPromise('p', 'color', 'blue'); })

Note that the return statement is needed: "Important: Always return results, otherwise callbacks won't catch the result of a previous promise". Try removing them and see what happens.
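What removing the return does can also be seen without the DOM. In this sketch, wait is a hypothetical helper that records a label after a delay; the two chains differ only in whether the inner promise is returned.

```javascript
// Hypothetical helper: after ms milliseconds, record label in log and resolve
function wait(ms, label, log) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      log.push(label);
      resolve(label);
    }, ms);
  });
}

// With return: the next .then() waits for the returned promise
const withReturn = [];
const chained = wait(20, "a", withReturn)
  .then(function () { return wait(20, "b", withReturn); })
  .then(function () { withReturn.push("done"); });

// Without return: the next .then() runs straight away and "b" arrives late
const withoutReturn = [];
const broken = wait(20, "a", withoutReturn)
  .then(function () { wait(20, "b", withoutReturn); })
  .then(function () { withoutReturn.push("done"); });

chained.then(function () { console.log(withReturn); });   // ["a", "b", "done"]
broken.then(function () { console.log(withoutReturn); }); // ["a", "done"]
```

In the color exercise the same mistake would make the later colors fire without waiting their turn.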

14 The Document Object Model (DOM) API standard

Let's find out what the DOM API actually is. This spec should be what most browsers have implemented but of course we'd actually have to manually check each browser's documentation.

  • We begin to read through the WHATWG DOM spec
    • it's another 'living standard', so unversioned: it keeps adding features under the same name. This means you can't find definitive conformance claims like "This browser fully conforms to DOM API v5.1".
    • Some standard architecture on trees, ordered sets, selectors, namespaces
  • DOM Events:
    • We already know some of this, except for synthetic events: applications dispatching events themselves
      • Whoever wrote this filled it with old memes, the absolute state of current year standards documents
      • Sometimes when you click on a definition it navigates via URL fragment to the same paragraph you're already reading
    • The entire interface for Event is shown, and a list of methods, plus CustomEvent, EventTarget.
      • Interesting note: browsers need to use special legacy pre-activation/cancel algorithms to implement checkbox and radio input elements
      • Another interesting note about dispatching event listeners, for example old DOM api's implementing touch and wheel events which blocked the async scrolling
    • There's a very long step by step spec for how browsers can dispatch an event
    • In firing events, initializing IDL attributes are mentioned: IDL is Interface Definition Language/Web IDL spec.
      • Firing in DOM context: creating, initializing, and dispatching an event.
      • Important: An event signifies an occurrence, not an 'action'. Events don't start an algorithm; they influence an ongoing one with notifications.
      • A section on capture and bubbling: a dispatched event first travels from the root down towards the event target (the originating tree node), which is the capture phase, and then, if the event bubbles, back up from the target towards the root. Listeners run in the bubbling phase by default and only run during capture when they are registered with the capture option.
      • Shadow DOM is also specified: a shadow tree is a separate, encapsulated node tree attached to a host element, used for things normally not selectable in the 'light' DOM, such as the play buttons for <video></video>.

We finally arrive at 'Introduction to The DOM' in section 4

  • As we knew markup goes in, a tree filled with HTML elements is constructed in the browser
    • There's a DOM Viewer which is somewhat more clear than Chrome DevTools
  • A document tree is described, and tree order
  • A lot of notes on HTML slots
  • A large section on how insertion/pre-insertion algorithms work
  • We get a list of methods we can run on collections, elements and nodes:
    • node.children returns child elements (seen in the 6.170 slides)
    • node.firstElementChild returns the first child that is an element or null otherwise
    • node.lastElementChild returns the last child that's an element or null
    • node.prepend(nodes) inserts before the first child of node
    • node.append(nodes) appends to the last child of node
    • node.querySelector(selectors) returns the first element that is a descendant of node and matches selectors
    • node.querySelectorAll(selectors) returns all element descendants of node that match selectors
    • node.previousElementSibling returns first preceding sibling that is an element
    • node.nextElementSibling returns the first following sibling that is an element
    • node.before(…nodes) inserts nodes before node
    • node.after(…nodes) inserts nodes after node
    • node.replaceWith(…nodes) replaces node with nodes
    • node.remove() removes
  • A collection is an object that represents a list of nodes, like NodeList which we already saw or HTMLCollection
    • NodeList has various methods: NodeList.length, NodeList.item(index), NodeList[index]
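    • HTMLCollection and NodeList are old-style collections: the spec calls HTMLCollection a historical artifact the web can't get rid of, so they are discouraged for new APIs rather than likely to disappear from browsers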
  • There's mutation observers we can use to log or run callbacks whenever a target mutation occurs
  • node.cloneNode([deep = false]) returns a copy of node; if deep is true the copy also includes its descendants
  • .createHTMLDocument(title) returns a document w/basic tree constructed including a title
    • useful for generating HTML should we ever want to do that
  • A useless feature: hasFeature() originally would report if the user agent supported a given DOM feature, but was so unreliable they deprecated it; it now always returns true so as not to break legacy pages.
  • Element nodes are just elements
    • Long list of interesting element methods:
      • the insertAdjacentText method returns nothing because 'it existed before we had a chance to design it'
  • Introduction to "DOM Ranges"
    • Range objects represent a sequence of content within a node tree.

In this example we are cutting/concatenating the two text lines into "syndata is awes":

//Tree:
    Element: p
        Element: <img src="insanity-wolf" alt="Little-endian BOM; decode as big-endian!">
        Text:  CSS 2.1 syndata is 
        Element: <em>
            Text: awesome 
        Text: ! 

//Range example:
var range = new Range(),
    firstText = p.childNodes[1],
    secondText = em.firstChild
range.setStart(firstText, 9) // do not forget the leading space
range.setEnd(secondText, 4)
// range now stringifies to the aforementioned quote

The rest of the document covers Traversal interfaces/methods, which we already saw except NodeFilter which we can use to iterate over NodeIterator and TreeWalker objects and filter results.

14.1 The DOM explained via example

After reading the spec, we now know the DOM is literally the HTML document, modelled as various objects interacting with each other. Events are (usually) initiated by the user agent (browser) and will, in most cases, propagate up the tree by 'bubbling', running any handlers listening for them along the way. Since the DOM is the HTML, any attributes can be set and document elements manipulated, except for the shadow tree, whose elements are normally not reachable from the page, such as the play button on <video> elements.

For example let's look at HTML video. See this question/solution here with the problem trying to get a local .vtt subtitle file to load in a browser. Note that browsers have security settings to prevent any loading of local files unless you manually override these settings. The solution was to encode the subtitles in the page as data and then create a URL object to attach to the <track> element as the new source of the subtitles.

The original HTML for embedding the video, to display locally that failed:

<video controls preload="none" width="800" height="600" poster="test.jpg">
<source src="test.mp4" type='video/mp4' />
<track kind="subtitles" src="test_EN.vtt" srclang="en" label="English"></track>
<track kind="subtitles" src="test_FR.vtt" srclang="fr" label="French"></track>
</video>

The solution in the HTML, was to give <track> an id, select it with Javascript, and fill in the <track> attributes to point to the raw data subtitles:

<video controls>
  <source type="video/mp4" src="videoFile.mp4">
  <track id="subtitle" kind="subtitles" srclang="en" label="English">
</video>

<script>
var subtitle = "V0VCVlRUDQoNCjENCjAwOjAwOjI4Ljg5NSAtLT4g...";
subtitle = window.atob(subtitle);
var subBlob = new Blob([subtitle]);
var subURL = URL.createObjectURL(subBlob);

document.getElementById("subtitle").setAttribute("src", subURL);
</script>

Since the DOM is the document modelled a different way, we can set the src to a new URL object that points to the data. In fact you could leave off all HTML attributes: <track id="EnglishTrack"> and fill in the rest of the missing attributes by attaching them to that element in the DOM. If you were to open this in your browser and enable "show user agent shadow DOM" in the DevTools settings, you would see the shadow tree for the <video> element in Chrome DevTools, and how it renders play buttons and other media controls.
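The encoding half of that solution can be tried on its own. This sketch assumes a runtime with btoa, atob and Blob as globals (any modern browser, or a recent Node), and the subtitle cue text is invented.

```javascript
// A tiny, made-up WebVTT subtitle file as a string
const vtt = "WEBVTT\n\n1\n00:00:01.000 --> 00:00:03.000\nHello\n";

// btoa produces the kind of base64 string that gets pasted into the page
const encoded = btoa(vtt);

// atob reverses it, which is the first step of the script above
const decoded = atob(encoded);

// Wrap the text in a Blob; the "text/vtt" type is an addition not in the original
const subBlob = new Blob([decoded], { type: "text/vtt" });

console.log(atob("V0VCVlRU")); // "WEBVTT", the first 8 characters of the string above
console.log(decoded === vtt);  // true: the round trip is lossless
console.log(subBlob.size);     // size in bytes

// In the browser the final step would be:
//   const subURL = URL.createObjectURL(subBlob);
//   document.getElementById("subtitle").setAttribute("src", subURL);
```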

15 Workshop 2: NodeJS: Set up your own server

This workshop is using app.js from here which is empty on the master branch (see the completed branch after the workshop).

NodeJS:

  • is a js runtime engine
  • Node Package Manager (NPM):
    • Many security considerations. The 6.170 course will likely cover some of these, like malicious code injected into package dependencies, or a package changing ownership by being sold to some adtech or other shady outfit. You have to carefully control when you update any packages (lockfiles).
    • Dependency complexity: npm packages are often micro modules, so a package can have hundreds of dependencies
      • example: this pkg really has only 1 line of code when you omit all the exceptions. It even imports append-type, another micro module.
      • the tests don't even catch an out of bounds value for (end - start); if you wrote this yourself, you would probably catch it
  • exclude node_modules in the .gitignore file in the project directory
  • 'npm install' will reinstall node modules after cloning a project, or after a new library is added to package.json

The rest of the workshop walks through a basic server. He starts it with 'node app.js', then you enter localhost:3000 in your browser. Kill the server from the terminal with CTRL-C.
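To make the 'basic server' concrete, here is a minimal sketch using only Node's built-in http module. This is not the workshop's app.js; the handler name and the greeting are my own assumptions for illustration.

```javascript
// Minimal sketch of a Node server using only the built-in http module.
const http = require("http");

// Hypothetical handler: answers every request with a plain-text greeting.
function handleRequest(req, res) {
  res.statusCode = 200;
  res.setHeader("Content-Type", "text/plain");
  res.end("Hello from " + req.url + "\n");
}

const server = http.createServer(handleRequest);

// Uncomment to serve on http://localhost:3000 (stop it with CTRL-C):
// server.listen(3000, () => console.log("listening on port 3000"));
```

With the listen line uncommented, 'node app.js' keeps running until you kill it, and every path you visit on localhost:3000 goes through handleRequest. Express wraps this same req/res pair with routing and middleware.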

16 Workshop 2: Catbook using Express

We are cloning this repository (run npm install after)

  • nodemon: restarts node whenever a file changes. This can also be a simple editor after-save-hook if you want, or a bash script with 'pkill node' and whatever commands restart node, assigned to a keyboard shortcut.
;; emacs.el entry, assuming you're using emacs/js2-mode
(add-hook 'after-save-hook
  (defun restart-node()
    (when (eq major-mode 'js2-mode)
      (call-process-shell-command "sh /home/username/noderestart.sh &" nil 0))))

This workshop is a simple walkthrough of setting up a router. This is documented on expressjs.com, and the 'middleware functions' documentation there explains the req/res objects.

16.1 Recitation Node & Express from 6.170

Following the slides here

function listAll(fields) {
    axios.get('/api/shorts') 
        .then(function (response) { return showObject(response.data); });
}         

function updateOne(fields) {
    axios.put('/api/shorts/'+ fields.name, fields)
        .then(showResponse)
        .catch(showResponse);
}

function deleteOne(fields) {
    // axios.delete takes (url, config), so there is no request body to pass here
    axios.delete('/api/shorts/' + fields.name)
        .then(showResponse)
        .catch(showResponse);
}

Exercise: use express-session. We haven't covered this yet, so I mainly went through the documentation on GitHub. There is a web.lab lecture later on authentication; this is hacky, unsafe auth for playing around purposes:

//add session middleware to app.js
const session = require('express-session');
app.use(session({
  secret: 'dontUseInProduction',
  resave: false,            // don't rewrite sessions that didn't change
  saveUninitialized: false, // don't store empty sessions
}));

//update routes/user.js to set the user's username in the session
//this is prob wrong, it worked anyway
router.post('/signin', (req, res) => {
  req.session.username = req.body.username;
  res.send('Hello ' + req.session.username);
});

//update routes/shorts.js to use the session username and populate creator field
//You add req.session.username anywhere there is a Shorts object being updated
router.post('/', (req, res) => {
  if (Shorts.findOne(req.body.name) !== undefined) {
    res.status(400).json({
      error: `Short URL ${req.body.name} already exists.`,
    }).end();
  } else {
    const short = Shorts.addOne(req.body.name, req.body.url, req.session.username);
    res.status(200).json(short).end();
  }
});

Step 6, we're asked to try/think about ways our toy url-shortener is insecure. For example anybody can rewrite the URLs to redirect to their malicious ad site, and nothing is validated by our API. To see how to fix this, go through some of the SEI CERT coding standards for other languages to see how they whitelist inputs, or look at the documentation of validate.js
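As a rough idea of what whitelisting the inputs could look like, here is a sketch of a validator for the request body. The function name, the 32 character limit and the error strings are my own assumptions, not part of the recitation code, and it only covers input validation (it does nothing about who is allowed to overwrite a short).

```javascript
// Hypothetical validator for the body of POST /api/shorts.
// Returns an error string, or null when the input looks acceptable.
function validateShort(name, url) {
  // Allowlist: short names may only contain letters, digits, '-' and '_'.
  if (typeof name !== "string" || !/^[A-Za-z0-9_-]{1,32}$/.test(name)) {
    return "name must be 1-32 characters of letters, digits, '-' or '_'";
  }
  let parsed;
  try {
    parsed = new URL(url); // throws a TypeError if url is not an absolute URL
  } catch (err) {
    return "url is not a valid absolute URL";
  }
  // Only allow web links, which rules out schemes such as javascript: or data:
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return "url must use http or https";
  }
  return null;
}
```

In the router you would call this first and answer with res.status(400).json({ error }) whenever it returns a string, before touching the Shorts model.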

17 Workshop 3: Hooking it up with MongoDB

Relational dbms is briefly covered, but there is an entire open course from CMU if you're interested.

  • Instead of using the community version I signed up to Atlas to try it
    • Surprisingly easy, no email verification annoyance or even impossible captcha
    • They've already changed the UI since this workshop but easy to figure out
  • Clone the repository as usual, looks like mongoose-snippets is dead
  • Schemas are barely talked about but we'll do two recorded lectures from 6.170 about schema design
  • JSON is used instead of SQL to describe document objects.

This is just following the slides/lecture and implementing some basic features, based on other features they've already implemented. If you git reset to stage3 and run npm start again, like I did, without changing db.js to your own Atlas credentials, you'll accidentally connect to the class demo db and populate your front page with 'btw, you forgot to change db.js gj'

18 Relational databases

None of the SQL slides from 6.170 course will make any sense unless you watch a few lectures from CMU's dbms class such as the intro to the relational model lecture describing foreign keys, normal form, etc. So let's do that now. SQL is still one of the most valuable skills.

18.1 CMU 15-445 Relational Model

  • YouTube - Relational Model (Fall 2019 playlist)

This is a course about designing and building a disk storage db management system, but we're going to audit the first couple of lectures. Some set notation for relational algebra is used, this can be learned from Chapter 2.1 of this free book or watch some of this lecture, or read through this brief tutorial.

  • Lecture starts around 18:30
  • Some talk about problems of flat file/csv data implementations
  • 'Relation' means a table
  • A schema is a definition of what you're storing in the db
  • A record (row) is a tuple, when you declare a function in a programming language (ex: def fun(x, y)) the parameters are also a tuple
  • An attribute is a column
  • In the example for primary keys, Artist(id, name, year, country) is the schema of the whole relation; the primary key is the id attribute, which uniquely identifies each tuple
  • Foreign key is an attribute from one relation that maps to another relation (a different table)
  • Relational Algebra is based on sets (union, intersection, difference)
    • Select: filter (\(\sigma\) sigma) in SQL: where clause
    • Projection: reorder/manipulate values (\(\pi\) pi) in SQL: SELECT bid-100, aid FROM R (the SELECT output list)
    • Union: R \(\cup\) S concatenates the tuples of both relations, in SQL: UNION ALL
    • Intersection: R \(\cap\) S contains only the tuples that appear in both relations, in SQL: INTERSECT
    • Difference: R - S a relation that contains only the tuples in the first and not the second relation in SQL: EXCEPT
      • relations must have exact same type/attributes
    • Product: R x S a cartesian product, in SQL: CROSS JOIN
      • give all unique combinations/configurations
    • Joins: R \(\bowtie\) S generates all combinations of tuples that have a common value on their shared attributes, in SQL: NATURAL JOIN
      • similar to intersection except attributes can be different/not shared between both relations
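The operators above are easier to remember once you see them as plain functions over arrays. This is only a sketch: relations are modelled as arrays of objects, the sample relations R, S and T are made up, and duplicate elimination is ignored.

```javascript
// Relations modelled as arrays of plain objects (tuples). Illustrative data only.
const R = [{ aid: "a1", bid: 101 }, { aid: "a2", bid: 102 }, { aid: "a3", bid: 103 }];
const S = [{ aid: "a3", bid: 103 }, { aid: "a4", bid: 104 }];
const T = [{ aid: "a3", name: "x" }, { aid: "a9", name: "y" }];

// Two tuples are equal when they have the same attributes with the same values.
const sameTuple = (a, b) =>
  Object.keys(a).length === Object.keys(b).length &&
  Object.keys(a).every((k) => a[k] === b[k]);

const select = (rel, pred) => rel.filter(pred);                                // sigma: WHERE
const project = (rel, fn) => rel.map(fn);                                      // pi: SELECT list
const unionAll = (r, s) => [...r, ...s];                                       // UNION ALL
const intersect = (r, s) => r.filter((t) => s.some((u) => sameTuple(t, u)));   // INTERSECT
const difference = (r, s) => r.filter((t) => !s.some((u) => sameTuple(t, u))); // EXCEPT
const product = (r, s) => r.flatMap((t) => s.map((u) => [t, u]));              // CROSS JOIN

// NATURAL JOIN: pair up tuples that agree on every attribute they share.
function naturalJoin(r, s) {
  return r.flatMap((t) =>
    s.filter((u) => Object.keys(t).filter((k) => k in u).every((k) => t[k] === u[k]))
      .map((u) => ({ ...t, ...u })));
}
```

For example, project(R, (t) => ({ bid: t.bid - 100, aid: t.aid })) is the SELECT bid-100, aid FROM R query, and naturalJoin(R, T) only keeps the a3 tuple because that is the only aid both relations share.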

His observations on performance and the declarative way of doing queries are interesting, and lead into the reasons SQL exists: it is generally preferable to write a high level query ("give me this answer") and let the dbms figure out the most optimized way to compute it, instead of giving a specific sequence of instructions.

18.1.1 Slides from Relational Databases: SQL from 6.170

We can review the slides from MIT's 6.170 on SQL as they will make sense now.

  • Relation: table
  • Attributes: the table columns
  • Tuple: the table rows
  • Types: attributes (columns) enforce a specific type for all tuples in the relation, except NULL which we already know from the CMU lecture, is a special value that can be found in any tuple
  • Key: attribute, or set of attributes (remember Artist(id, name, year, country)) whose value is unique for every tuple.
    • some SQL examples are shown

If you want a full introduction to SQL, the book for the CMU course, Database System Concepts, covers it in chapters 3 through 5, and can be found on WorldCat (library near you) or Library Genesis as a pdf. Note Library Genesis has numerous different domains to avoid censorship. The simple SQL example templates shown in MIT's 6.170 slides above are basically SQL in a nutshell: SELECT column FROM table WHERE (filter condition). Using commas you can include numerous attributes and relations, such as SELECT atr1, atr2 FROM rel1, rel2 WHERE (condition AND condition OR condition).

18.2 CMU 15-445 Advanced SQL

  • YouTube - Advanced SQL (Same 2019 playlist)

Let's learn something more than basic SQL, and see how doing simple things is difficult since no dbms fully follows the SQL standard. It's much like the HTML, CSS and DOM standards we looked at, where every browser has a different implementation.

  • Query optimizing is shilled as a high demand skill
  • SQL name exists because IBM was sued for calling their prototype 'SEQUEL'
    • Structured English Query Language
  • Current standard: SQL:2016
    • nobody fully follows the current standard; SQL-92 is the baseline most systems support
  • SQL is a collection of relational languages
    • DML: Data Manipulation Lang: insert/update/select
    • DDL: Data Definition Lang: define schemas
    • DCL: Data Control Lang: security auth
  • Important: SQL is based on 'bag algebra'. No ordering, allows for duplicates.
  • Aggregates: take a bag of tuples, return a single value
    • ex: AVG (average), MIN, MAX, SUM, COUNT = the # of values
    • Aggregate functions can only be used in the SELECT output list (or in HAVING), not in WHERE, since WHERE filters individual tuples before any aggregate is computed
    • GROUP BY: split the tuples into groups and compute the aggregate once per group
    • HAVING: filters groups after aggregation, ie: only show me groups with an avg gpa > 3.0, where the aggregate is calculating the avg of all gpas in each group
  • Strings:
    • Single quotes, usually case sensitive (except MySQL)
    • LIKE is used for string matching: '%' matches any substring (including empty), underscore matches any one character
    • SQL-92 standard defines numerous other string functions, and operations like || (concatenation)
      • Postgres and Oracle most closely follow the SQL-92 standard
  • Output Control
    • We can dump the output to another table, if that table doesn't exist then our dbms will create it for us
      • again it's a declarative language, you tell it what you want, the system does it for you
    • We can limit the tuple output, or order it with ORDER BY
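To see what GROUP BY and HAVING actually do, here is the same pipeline written by hand. The enrolled rows and column names are made up for illustration, and the SQL in the comment is only roughly what the lecture shows.

```javascript
// Hypothetical enrollment rows. In SQL this would be roughly:
//   SELECT cid, AVG(gpa) AS avg_gpa, COUNT(*) AS cnt
//   FROM enrolled GROUP BY cid HAVING AVG(gpa) > 3.0 ORDER BY cid;
const enrolled = [
  { cid: "15-445", gpa: 3.5 },
  { cid: "15-445", gpa: 4.0 },
  { cid: "15-721", gpa: 2.5 },
  { cid: "15-721", gpa: 3.0 },
  { cid: "15-826", gpa: 3.5 },
];

// GROUP BY: bucket the rows by a key.
function groupBy(rows, keyFn) {
  const groups = new Map();
  for (const row of rows) {
    const k = keyFn(row);
    if (!groups.has(k)) groups.set(k, []);
    groups.get(k).push(row);
  }
  return groups;
}

// Aggregate: a bag of values in, a single value out.
const avg = (nums) => nums.reduce((a, b) => a + b, 0) / nums.length;

const result = [...groupBy(enrolled, (r) => r.cid)]
  .map(([cid, rows]) => ({ cid, avgGpa: avg(rows.map((r) => r.gpa)), cnt: rows.length }))
  .filter((g) => g.avgGpa > 3.0)               // HAVING runs on whole groups
  .sort((a, b) => a.cid.localeCompare(b.cid)); // ORDER BY cid
```

The filter has to come after the map because the average doesn't exist until the group has been collapsed, which is the same reason HAVING exists separately from WHERE.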

Now we are covering what Prof Pavlo considers advanced SQL, such as nested queries. Everything in this lecture is documented in standards and dbms documentation, we'll just look at the high level overview for these queries. Have the slides open as you watch the lecture, pause to understand them then continue as he races through these topics in the lectures.

-Nested Queries

  • queries inside of queries, take the output of one query use it as the input to another
    • query optimizers may rewrite the example query as a join
  • ALL: every tuple must satisfy expression
  • ANY: must satisfy for at least one row
  • IN: equivalent to '=ANY()' (is there any tuple that equals my predicate)
  • EXISTS: at least one row is returned
    • can be combined w/boolean: NOT EXISTS (nothing is returned)
  • Can only reference outer query from inside query, not other way around
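The nested query operators map fairly directly onto array methods, which helped me remember them. A sketch with made-up student/enrolled tables (the names and rows are my own assumptions):

```javascript
// Hypothetical tables.
const student = [
  { sid: 1, name: "Ann" }, { sid: 2, name: "Bob" }, { sid: 3, name: "Cy" },
];
const enrolled = [
  { sid: 1, cid: "15-445" }, { sid: 3, cid: "15-445" }, { sid: 3, cid: "15-721" },
];

// Inner query: SELECT sid FROM enrolled WHERE cid = '15-445'
const inner = enrolled.filter((e) => e.cid === "15-445").map((e) => e.sid);

// IN (same as = ANY): students whose sid appears in the inner result.
const inCourse = student.filter((s) => inner.includes(s.sid)).map((s) => s.name);

// NOT EXISTS: students with no enrolled row at all. The inner query refers to
// the outer tuple s, which is the only direction references may go.
const notEnrolled = student
  .filter((s) => !enrolled.some((e) => e.sid === s.sid))
  .map((s) => s.name);

// ALL: the sid that is >= every sid found in enrolled.
const maxSid = student
  .filter((s) => enrolled.every((e) => s.sid >= e.sid))
  .map((s) => s.sid);
```

So ANY/IN/EXISTS behave like some(), and ALL behaves like every(), with the inner query evaluated for each tuple of the outer one (unless the optimizer rewrites it as a join).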

Numerous examples of nested queries are shown, along with different ways to rewrite them. At a high level, as already stated, you're taking the output of one query and using it as input to another.

-Window Functions

  • Computing some function on tuples, like Aggregates but returns the tuple(s) and the value
  • SELECT … FUNC-NAME(..) OVER(…)
  • OVER() clause specifies how to slice up data (or sort it), can also leave blank
  • PARTITION BY to specify how to group things, OVER(PARTITION BY ..)
  • OVER (ORDER BY ..) sorting results
    • RANK() produces the rank of the sort order
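Here is a hand-rolled sketch of what RANK() OVER (PARTITION BY type ORDER BY minutes DESC) computes, to show that every input tuple comes back with a value attached (unlike an aggregate). The titles rows and the function name are made up.

```javascript
// Hypothetical rows.
const titles = [
  { type: "movie", title: "A", minutes: 120 },
  { type: "movie", title: "B", minutes: 95 },
  { type: "movie", title: "C", minutes: 120 },
  { type: "short", title: "D", minutes: 12 },
  { type: "short", title: "E", minutes: 30 },
];

function rankByMinutesDesc(rows) {
  // PARTITION BY type
  const partitions = new Map();
  for (const row of rows) {
    if (!partitions.has(row.type)) partitions.set(row.type, []);
    partitions.get(row.type).push(row);
  }
  const out = [];
  for (const part of partitions.values()) {
    // ORDER BY minutes DESC, within this partition only
    const sorted = [...part].sort((a, b) => b.minutes - a.minutes);
    sorted.forEach((row, i) => {
      // Ties share a rank and the next distinct value skips ahead, as RANK() does.
      const tied = i > 0 && sorted[i - 1].minutes === row.minutes;
      const rank = tied ? out[out.length - 1].rank : i + 1;
      out.push({ ...row, rank });
    });
  }
  return out;
}

const ranked = rankByMinutesDesc(titles);
```

Filtering ranked down to rank === 1 gives the longest title(s) per type, and because A and C tie you get both of them back, which is exactly why a tie-breaker column is usually added to the ORDER BY.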

He goes over examples of this, you can open up a local sqlite3 or postgres install and try these queries yourself.

18.2.1 Common Table Expressions

The last example is Common Table Expressions (CTEs). I made this its own sub-heading because you'll definitely run into CTEs in the wild. The ideal scenario is to strive to compute your answer as a single SQL statement as opposed to multiple statements, because then the dbms can do optimizations behind the scenes for you. With CTEs you can declare 'here's everything I want to do' in a single statement. He also gives an example of recursive CTEs.

WITH cteName AS (
  SELECT 1
)
SELECT * FROM cteName

This creates a temporary table that the SELECT below the 'WITH .. AS' can reference like it was an already existing table.

18.3 CMU 15-445 Normal Forms

In later versions of 15-445, Prof Pavlo eliminated the lecture on normal forms entirely, mainly because 'people just wing it, and end up with 3rd Normal Form anyway'.

YouTube - Normal Forms (2017 Playlist)

The first 10 minutes are a review of the previous lecture, which we skipped, about lossless joins, dependency preservation, and avoiding redundancy while performing decomposition. They're good to know, and since you already know relational algebra you'll easily understand that lecture, so I recommend it, but for the sole purposes of MIT web.lab and 6.170 we're good with just a normal forms review.

  • There's generally 6 normal forms, most common is 3NF/Boyce-Codd Normal Form(BCNF).
  • 1NF (first normal form) is a single table with atomic attributes
  • The 4+ normal forms are rare, information theory forms
    • By default you usually end up in 3NF
  • 1NF: All types must be atomic, no repeating groups
    • can't have multiple values in columns
    • shouldn't repeat groups (customer-name1, customer-name2)
    • array types are actually valid, but from a theoretical standpoint this violates 1NF
  • 2NF: Everything must be atomic, without repeating groups (1NF), and non-key attributes must fully depend on the whole candidate key, not just part of it
    • Removing more redundancy
    • Considered 'good enough form'
  • BCNF and lossless joins will make no sense since we didn't cover superkeys by skipping lecture 4
    • 'decomposition' means splitting a relation (table) into other relations
    • There's problems with BCNF anyway and you don't want to use it
  • 3NF: Preserved dependencies but may have some redundant data
    • As an amateur you'll probably just naturally produce 3NF
    • Prof Pavlo claims nobody actually designs a db schema this way (calculating normal forms)
    • You end up thinking in terms of objects
      • NoSQL drop all data protections (no ACID)
    • He argues mongoDB is still 1NF but with an array
    • Document data model (NoSQL) blurs line between logical and physical layer
      • Problems: you end up now specifying the physical layout of data, no longer a declarative model
      • All the problems of the 1970s (System R) that were talked about in prev lectures return
      • System R was abandoned, too impossible to maintain

Writing your queries in a declarative way is the 'best of both worlds' because the dbms will make the most efficient decision. The takeaway from these lectures is: use Postgres (or the latest version of sqlite3 for a small embedded dbms) since it's the dbms that most conforms to the SQL standard, and use CTEs, unless you have a specific model that requires the document data model (NoSQL).

18.3.1 Schema design from 6.170

Now that we know what normal forms are, let's review the 6.170 slides on Normalization and Constraints.

  • Separation of concerns: make columns atomic
  • 3NF: Can the value of a column be derived from (or described by) another non-key column? If so you violate 3NF
    • Move these into new relations
    • Can you 'over normalize'? Yes
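A small sketch of the 'move these into new relations' step, using a made-up orders table where the city is determined by the customer and not by the order. The table and column names are my own assumptions, not from the slides.

```javascript
// Hypothetical orders table: customerCity is determined by customerId,
// not by the key orderId, so it is repeated on every order.
const orders = [
  { orderId: 1, customerId: 7, customerCity: "Boston" },
  { orderId: 2, customerId: 7, customerCity: "Boston" },
  { orderId: 3, customerId: 9, customerCity: "Austin" },
];

// Move the dependent column into its own relation keyed by customerId.
const customers = [
  ...new Map(
    orders.map((o) => [o.customerId, { customerId: o.customerId, city: o.customerCity }])
  ).values(),
];

// What is left of orders keeps customerId as a foreign key.
const slimOrders = orders.map(({ orderId, customerId }) => ({ orderId, customerId }));

// Joining the two relations back together recovers the original rows.
const rejoined = slimOrders.map((o) => ({
  ...o,
  customerCity: customers.find((c) => c.customerId === o.customerId).city,
}));
```

Now "Boston" is stored once, so a customer moving cities is a single update instead of one per order. The price is the join you need to get the original view back, which is where over normalizing starts to hurt.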

18.4 SQL Exercises from 15-445

Let's try the sqlite3 assignment from this homework. Note: make sure you get a recent sqlite3 version (3.25 or newer) so you can use features such as window functions.

  • Q2: List the longest title of each type, along with the running time, use the primary title ASC as a tie-breaker.
  • We'll need a window function for this, since we want to rank, that way we get back all rankings that are 1 for each type
SELECT type, primary_title, runtime_minutes
FROM (
  SELECT *, RANK() OVER (PARTITION BY type ORDER BY runtime_minutes DESC, primary_title ASC) AS rank
  FROM titles
) AS ranking
WHERE ranking.rank = 1;

---
movie|Logistics|51420
short|Kuriocity|461
tvEpisode|Téléthon 2012|1800
tvMiniSeries|Kôya no yôjinbô|1755
tvMovie|ArtQuench Presents Spirit Art|2112
tvSeries|The Sharing Circle|8400
tvShort|Paul McCartney Backstage at Super Bowl XXXIX|60
tvSpecial|Katy Perry Live: Witness World Wide|5760
video|Midnight Movie Madness: 50 Movie Mega Pack|5135
videoGame|Flushy Fish VR: Just Squidding Around|1500

You don't have to write SQL in all caps, it's just so the reader can understand more clearly what's going on. Try removing the primary_title ASC from the ORDER BY inside OVER() and change ranking.rank to 2. Notice there are a lot of tvEpisode rows returned, all with a 720 minute running time. A tie-breaker is what you decide should be returned in a situation like this, and we were asked to use ascending primary title as the tie-breaker, meaning sort alphabetically and choose the first result.

  • Q3: Print type and number of associated titles, sort by number of titles in ascending order
  • Remember aggregates from the slides
SELECT type, COUNT(type) AS cnt FROM titles GROUP BY type ORDER BY cnt ASC;

---
tvShort|4075
videoGame|9044
tvSpecial|9107
tvMiniSeries|10291
tvMovie|45431
tvSeries|63631
video|90069
movie|197957
short|262038
tvEpisode|1603076

  • Q4: Print all decades and number of titles, constructing a string that looks like '2010s'. Sort in desc order with respect to number of titles.
  • This is again right out of the slides and teaches you that you can group results by almost anything
    • sqlite uses its own syntax for substring which you have to look at the documentation for
SELECT SUBSTR(premiered, 1, 3) || '0s' AS decade, COUNT(premiered) AS cnt
FROM titles WHERE premiered IS NOT NULL GROUP BY decade ORDER BY cnt DESC;

---
2010s|1050732
2000s|494639
1990s|211453
1980s|119258
1970s|99707
1960s|75237
1950s|39554
1910s|26596
1920s|13153
1930s|11492
1940s|10011
1900s|9586
2020s|2492
1890s|2286
1880s|22
1870s|1

As of the date of this writing (Sep 8) I just helped finish somebody's homework since the solutions haven't been posted yet, though mine seems horribly inefficient and you could also do Q2 with a JOIN. Working through and figuring out these few exercises will definitely help you learn the basics of SQL. The rest of that course can be completed if you take something like CMU's 15-213 or MIT 6.004 to learn about mmap, OS signals, etc.

19 Guest Lecture: MIT ODL

  • YouTube - IAP 2019 Day 5 - Emerging Opportunities/Trends in Ed-Tech

After detouring through SQL we're back at MIT's web lab. This is a guest lecture about ed-tech that's optional. Interesting: MITx MicroMasters program in Data Science helps you with admittance to their PhD SES program (you still need an undergrad unfortunately). The fastest and cheapest undergrad you can probably do which has regional accreditation (national accreditation usually doesn't qualify) is likely WGU, as you can audit the courses as fast as you want. However it's $7k USD/yr so you are paying ~$28k total for a piece of paper while essentially teaching yourself everything, with the added bonus that you can work F/T and still complete the courses.

Some other MIT specific things are talked about, like the firehose guide which can help you plan course loads. If you don't know already MIT has numerous free courses online for no credit self learning or paid for edx.org credentials.

20 Creating a Wireframe

I just went through the slides since I've already audited this open CMU course with lectures/slides which will teach you the basics of wireframing and human computer interaction. It's a bit overkill with topics like 'Contextual Inquiry Analysis' but much of the info in the slides is excellent, like designing for international audiences and exact instructions how to use tools like Balsamiq/InVision/PhotoShop though as per web.lab you can just do this with pencil/paper.

My criteria for design is not to use those awful bootstrap tablet style home pages or infinity scroll. In general I follow the advice of government sites which have so-called design systems (Australia and NZ have these too) that have put actual research into their designs over an entire population. There is common sense design advice there, like using breadcrumbs, avoiding terrible, terrible infinity scroll, making everything dead simple to find etc. The UK design system was open licensed last time I checked, and is straightforward ExpressJS/Node with their copy-paste CSS and front end features for assistive technology support (JAWS, ZoomText, NVDA etc), plus they maintain it so you don't have to.

They include a prototyping kit though you would obviously want to change colors to avoid looking exactly like GOV.UK and you would want to design your overall site API so a real designer could easily use it should you wish a unique looking site that also has a fallback capability for IE 11 screen readers.

21 Workshop 4: Review and Accounts and Authentication

  • YouTube - IAP 2019 Day 5 - Review and Authentication

This workshop is 2 hours long, but keep in mind a lot of that is filmed breaks, and students working on things you can skip if you've already completed the task.

  • Starts with a summary of what we've already done
  • They walk through comment posting, what the code does
    • The 'this' keyword is briefly described. As written before, it is explained in this book, though if you write in a functional style you will never need to use this-aware functions.
  • The authentication lectures start at 48:00
  • MIT 6.858 is recommended which we'll do when we cover 6.170 security lectures

The remainder of this workshop is the standard git reset hard and checkout of various stages, adding passport to catbook and describing what the passport code does, describing express sessions. If you still don't understand req/res then read the Express documentation.

btw, we're officially done with the first week. There's only 1 day of workshop left on React, some advice/guest lectures, plus some security reading and other optional lectures from 6.170

22 Modernizing Catbook w/Fetch and Socket.io

This is a lecture, not a workshop, that explains the basics of web sockets, promises, mapping over arrays. Most of this we've already done.

  • Socket.io is introduced
    • You probably would just want to use standard web sockets since support is now universal across browsers
    • Debugging socket.io bugs is a challenge at best, because bugs only show up on some browsers/connections due to the way it's designed
    • There's many other web socket APIs
    • For the purposes of banging out an MVP/prototype, which is what this workshop is for, socket.io is fine
  • Promises are introduced, which we already know from 6.170
  • fetch introduced, api.js rewritten to use promises
  • feed.js is rewritten using Array.map()/promises
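The Array.map()/promises pattern from the feed.js rewrite looks roughly like this. It is a sketch, not the actual catbook code: getComments is a hypothetical stand-in for a fetch call to an assumed comments endpoint, so the example runs without a server.

```javascript
// Hypothetical stand-in for something like
//   fetch("/api/comment?parent=" + id).then((r) => r.json())
// It resolves immediately with fake data.
function getComments(storyId) {
  return Promise.resolve([{ parent: storyId, content: "comment on story " + storyId }]);
}

// Map each story to a promise for "story plus its comments",
// then wait for all of them at once.
function loadFeed(stories) {
  const pending = stories.map((story) =>
    getComments(story.id).then((comments) => ({ ...story, comments })));
  return Promise.all(pending);
}

loadFeed([{ id: 1 }, { id: 2 }])
  .then((feed) => console.log(feed.map((s) => s.comments.length)))
  .catch((err) => console.error(err));
```

Promise.all keeps the results in the same order as the input array, so the feed renders in story order even if the individual requests finish out of order.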

23 How to actually design a site from scratch

  • YouTube - IAP 2019 Day 6 - How to Design a Web App

A lecture on how to put it all together yourself. The MIT 6.170 course has multiple recorded lectures available to the public about separation of concerns and abstracting a data model, which we'll do after the React lectures from weblab.

  • Sketch/wireframe your front page how you'd like it to look
  • From the sketch list the features you want on paper
  • Create "endpoints" (routes) on paper
  • Split up features as much as possible into components
  • What data do you need to store? Draft a schema
    • Rest of this lecture is competition specific advice, splitting up teamwork

The following is from The Art of Computer Programming and describes the Dijkstra/Knuth method of designing on paper, separating out as much as possible the things which can be self contained and reusable, and starting by building from the bottom up, then scrapping said paper and redesigning all over again. The main takeaway from here is that you really don't know what you need to do until you start to build. Eventually you will reach a point where your original spec needs to be completely redone, and this paper process helps you figure that out before you code thousands of lines and realize there's some sort of critical fault with your design. Note, in the industry, they have more modern models for this but as a solo programmer this is the best method I've found.

We conclude this section by discussing briefly how we might go about writing a complex and lengthy program. How can we decide what kind of subroutines we will need, and what calling sequences should be used? One successful way to determine this is to use an iterative procedure:

Step 0 (Initial idea). First we decide vaguely upon the general plan of attack that the program will use.

Step 1 (A rough sketch of the program). We start now by writing the "outer levels" of the program, in any convenient language. A somewhat systematic way to go about this has been described very nicely by E. W. Dijkstra, Structured Programming (Academic Press, 1972), Chapter 1, and by N. Wirth, CACM 14 (1971), 221-227. We may begin by breaking the whole program into a small number of pieces, which might be thought of temporarily as subroutines, although they are called only once. These pieces are successively refined into smaller and smaller parts, having correspondingly simpler jobs to do. Whenever some computational task arises that seems likely to occur elsewhere or that has already occurred elsewhere, we define a subroutine (a real one) to do that job. We do not write the subroutine at this point; we continue writing the main program, assuming that the subroutine has performed its task. Finally, when the main program has been sketched, we tackle the subroutines in turn, trying to take the most complex subroutines first and then their sub-subroutines, etc. In this manner we will come up with a list of subroutines. The actual function of each subroutine has probably already changed several times, so that the first parts of our sketch will by now be incorrect; but that is no problem, it is merely a sketch. For each subroutine we now have a reasonably good idea about how it will be called and how general-purpose it should be. It usually pays to extend the generality of each subroutine a little.

Step 2 (First working program). This step goes in the opposite direction from step 1. We now write in computer language, say MIXAL or PL/MIX or a higher-level language; we start this time with the lowest level subroutines, and do the main program last. As far as possible, we try never to write any instructions that call a subroutine before the subroutine itself has been coded. (In step 1, we tried the opposite, never considering a subroutine until all of its calls had been written.) As more and more subroutines are written during this process, our confidence gradually grows, since we are continually extending the power of the machine we are programming. After an individual subroutine is coded, we should immediately prepare a complete description of what it does, and what its calling sequences are (…).

Step 3 (Reexamination). The result of step 2 should be very nearly a working program, but it may be possible to improve on it. A good way is to reverse direction again, studying for each subroutine all of the calls made on it. It may well be that the subroutine should be enlarged to do some of the more common things that are always done by the outside routine just before or after it uses the subroutine. Perhaps several subroutines should be merged into one; or perhaps a subroutine is called only once and should not be a subroutine at all. (Perhaps a subroutine is never called and can be dispensed with entirely.) At this point, it is often a good idea to scrap everything and start over again at step 1! This is not intended to be a facetious remark; the time spent in getting this far has not been wasted, for we have learned a great deal about the problem. With hindsight, we will probably have discovered several improvements that could be made to the program's overall organization. There's no reason to be afraid to go back to step 1 - it will be much easier to go through steps 2 and 3 again, now that a similar program has been done already. Moreover, we will quite probably save as much debugging time later on as it will take to rewrite everything. Some of the best computer programs ever written owe much of their success to the fact that all the work was unintentionally lost, at about this stage, and the authors had to begin again. On the other hand, there is probably never a point when a complex computer program cannot be improved somehow, so steps 1 and 2 should not be repeated indefinitely. When significant improvement can clearly be made, it is well worth the additional time required to start over, but eventually a point of diminishing returns is reached.

Step 4 (Debugging). After a final polishing of the program, including perhaps the allocation of storage and other last-minute details, it is time to look at it in still another direction from the three that were used in steps 1, 2, and 3 - now we study the program in the order in which the computer will perform it. This may be done by hand or, of course, by machine. The author has found it quite helpful at this point to make use of system routines that trace each instruction the first two times it is executed; it is important to rethink the ideas underlying the program and to check that everything is actually taking place as expected. Debugging is an art that needs much further study, and the way to approach it is highly dependent on the facilities available at each computer installation. A good start towards effective debugging is often the preparation of appropriate test data. The most effective debugging techniques seem to be those that are designed and built into the program itself - many of today's best programmers will devote nearly half of their programs to facilitating the debugging process in the other half; the first half, which usually consists of fairly straightforward routines that display relevant information in a readable format, will eventually be thrown away, but the net result is a surprising gain in productivity.

Another good debugging practice is to keep a record of every mistake made. Even though this will probably be quite embarrassing, such information is invaluable to anyone doing research on the debugging problem, and it will also help you learn how to reduce the number of future errors. Note: The author wrote most of the preceding comments in 1964, after he had successfully completed several medium-sized software projects but before he had developed a mature programming style. Later, during the 1980s, he learned that an additional technique, called structured documentation or literate programming, is probably even more important. A summary of his current beliefs about the best way to write programs of all kinds appears in the book Literate Programming (Cambridge Univ. Press, first published in 1992). Incidentally, Chapter 11 of that book contains a detailed record of all bugs removed from the tex program during the period 1978-1991. - DEK The Art Of Computer Programming Vol 1., 1.4.1 Subroutines

About that last paragraph: it should be known that what Knuth considers 'medium-sized software projects' was actually designing entire compilers from scratch for dozens of various architectures. He also rewrote both IT (internal translator) and SOAP (symbolic optimum assembly program) in his sophomore year, and the user guide he wrote for his own program ended up being the textbook in a graduate class he later took at university.

Knuth also made a note about confidence: he claimed that in the beginning he really had no confidence he could write a large program, and by separating out concerns and building from the easiest parts up the tree he gained confidence that what he was trying to make could actually be built. As for Literate Programming, there is an interesting post here about its modern use, as well as the book Physically Based Rendering by Pharr, Jakob & Humphreys.

24 Reactive Programming from 6.170

This is our segue into learning React. The slides explain the principles of functional programming: first-class functions, declarative, (controlled) side effects, functions that return the same value for the same arguments, and immutable objects.

24.1 Exercises

Convert fireSale to a declarative function:

let toys = [
{name: "Woody", price: 10},
{name: "Buzz", price: 15},
{name: "Rex", price: 3},
{name: "Slinky", price: 7}];

function fireSale(minPrice, perc) {
  return toys
    .filter(function (item) { return item.price > minPrice; })
    .map(function (item) { return { name: item.name, price: item.price * perc }; });
}

Event streams treat events as a continuous stream of data rather than one-off occurrences that must be captured.

  • filtering events based on their properties
  • throttling the frequency of events in a stream
  • merging individual streams into a single stream
    • an example of a lightbulb changing state is given, with the state change encapsulated in a pure function
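
To make the three operations above concrete, here is a minimal sketch of a push-based event stream. None of this is from the slides: createStream, filter, throttle, merge and nextBulbState are hypothetical helpers I made up, and I'm assuming each event carries its own time field so throttling doesn't depend on the wall clock.

```javascript
// Minimal push-based event stream (hypothetical helper, not a real library API).
function createStream() {
  const listeners = [];
  return {
    subscribe(fn) { listeners.push(fn); },
    emit(event) { listeners.forEach(fn => fn(event)); },
  };
}

// filter: forward only events whose properties satisfy the predicate.
function filter(source, predicate) {
  const out = createStream();
  source.subscribe(e => { if (predicate(e)) out.emit(e); });
  return out;
}

// throttle: forward an event only if at least `ms` have passed since the last
// forwarded one, measured with the (assumed) `time` field on each event.
function throttle(source, ms) {
  const out = createStream();
  let last = -Infinity;
  source.subscribe(e => {
    if (e.time - last >= ms) { last = e.time; out.emit(e); }
  });
  return out;
}

// merge: combine several streams into a single stream.
function merge(...sources) {
  const out = createStream();
  sources.forEach(s => s.subscribe(e => out.emit(e)));
  return out;
}

// Lightbulb example: a pure function from (current state, event) to next state.
function nextBulbState(isOn, event) {
  return event.type === "toggle" ? !isOn : isOn;
}

// Usage: two sources merged, filtered to toggles, throttled to one per 100ms.
const wallSwitch = createStream();
const remote = createStream();
const toggles = throttle(
  filter(merge(wallSwitch, remote), e => e.type === "toggle"), 100);

let bulbOn = false;
const states = [];
toggles.subscribe(e => { bulbOn = nextBulbState(bulbOn, e); states.push(bulbOn); });

wallSwitch.emit({type: "toggle", time: 0});   // passes: bulb turns on
remote.emit({type: "toggle", time: 50});      // dropped by throttle
remote.emit({type: "hover", time: 200});      // dropped by filter
wallSwitch.emit({type: "toggle", time: 300}); // passes: bulb turns off
```

The only place state changes is the one subscriber at the end, and the rule for how it changes lives in nextBulbState, which is easy to test on its own.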

24.2 Reactive Programming: Vue.js from 6.170

Brief slides with an example of the model-view-viewmodel pattern. The recitation GitHub for Vue requires a school login, so if you want to learn more about Vue there's this. It definitely has a smaller surface area to learn and the community is large.

25 React workshop

  • YouTube IAP 2019 Day 6 - React Workshop

The lecture starts out attempting (again) to define the this keyword in JS; as noted previously, the You Don't Know JS book is likely the best resource for understanding this. The arrow function is used as a workaround: remember the event handling gotchas from the Event exercises? There was an example there of an arrow function used to update a counter, since arrow functions can use variables defined in the enclosing function/class. That's basically what the lecturer is trying to convey here: instead of relying on a function's own this, you can use an arrow function, which takes this from the surrounding scope.
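
A small sketch of that gotcha, outside of React. The Counter class and fireEvent function are hypothetical names of mine, with fireEvent standing in for any event system that stores your handler and calls it later as a plain function.

```javascript
// Hypothetical Counter class showing how `this` is lost when a method is
// passed around as a callback, and how an arrow function avoids it.
class Counter {
  constructor() {
    this.count = 0;
    // Arrow function: `this` comes from the enclosing constructor call,
    // so it always refers to this Counter instance.
    this.incrementArrow = () => { this.count += 1; };
  }

  // Regular method: `this` depends on how the function is called.
  incrementMethod() { this.count += 1; }
}

const counter = new Counter();

// Stand-in for an event system: calls the handler as a plain function.
function fireEvent(handler) { handler(); }

fireEvent(counter.incrementArrow); // works, count is now 1

let methodError = null;
try {
  // `this` is undefined inside the method here (class bodies are strict mode).
  fireEvent(counter.incrementMethod);
} catch (err) {
  methodError = err; // a TypeError
}

// The older workaround: bind `this` explicitly before passing the method.
fireEvent(counter.incrementMethod.bind(counter)); // count is now 2
```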

  • React JS is a 'front end library to build user interfaces'
  • State is owned by the component it's in, you can't directly edit state from another component
  • Properties are passed in from parents down the component tree
  • A given state has to live at least as high as the parent component that uses that state
    • Let's look at the brief slides from 6.170 about React to clear up what states/props are, and to clear up what componentDidMount() does
    • We can't do the rest of the 6.170 workshop since you need credentials for their github
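
Since the state/props rules are easier to see in code, here is a plain-JavaScript model of the data flow. This is not the React API: LikeButton, createApp and this setState are hypothetical stand-ins I wrote to mimic the rules in the list above (the parent owns the state, the child only gets props, and changes go through the owner).

```javascript
// Plain-JavaScript model of React-style data flow (NOT the React API).

// Child: a pure function of its props. It owns no state and cannot edit the
// parent's state; all it can do is call the callback it was handed.
function LikeButton(props) {
  return {
    text: `Likes: ${props.likes}`,
    click: props.onLike,
  };
}

// Parent: owns the state and is the only place that updates it.
function createApp() {
  let state = { likes: 0 };
  let view = null;

  function setState(partial) {
    state = { ...state, ...partial }; // replace the state rather than mutating it
    render();                          // re-render after every state change
  }

  function render() {
    // State flows down to the child as props.
    view = LikeButton({
      likes: state.likes,
      onLike: () => setState({ likes: state.likes + 1 }),
    });
  }

  render();
  return { getView: () => view };
}

const app = createApp();
app.getView().click();
app.getView().click();
const finalText = app.getView().text; // "Likes: 2"
```

If a second child also needed the like count, the state would already be in the right place: in the parent, above both of them.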

We're back to the weblab workshop, which begins around 36:00; we are cloning this repository. Note: the student who runs the workshop is a U.S. Collegiate Top 5 competitive figure skater as well as a full-time MIT engineering student, so if she has time to learn this curriculum, so do you.

  • Webpack hot module replacement is explained here as it was glossed over in the lecture
    • never use this in production for obvious security reasons
  • git checkout stephalf will overwrite your changes with their greeting, but you can still see how it hot reloads any new changes (alter and save app.js)
  • 1:22:00 he comments on the gotcha in the 6.170 slides about className vs class in the HTML (in JSX you have to use className)
  • this.setState() from the 6.170 slides is also covered, since you can't directly edit state from another component

The rest of the workshop is the standard git reset --hard, git checkout <stages>. There are numerous books on React, and another workshop after this if you're lost.

26 GraphQL

We skipped the Facebook GraphQL lecture because there are no notes/recordings. It's a query language with syntax similar to JSON and a detailed spec. A good free book about it is The Road to GraphQL, which is available here. The same author wrote The Road to React, which is also free as noted in the previous section. If you did those SQL lectures you can figure out GraphQL.

27 Web App Security from 6.170

The definitive guides for these things are The Web Application Hacker's Handbook and The Tangled Web: A Guide to Securing Modern Web Applications. With the first book you get a free copy of burpsuite and an old version of Wordpress or something, and you work through the book by attacking your sample server (which you should probably put behind credentials/a firewall so bots don't automatically attack it). The second book will teach you how there are risks even in CSS parsing, the many pitfalls of generating HTML, cache poisoning and other problems you should be aware of.

27.1 How to attack systems

Some slides from 6.170 on how to attack systems and company protocols.

  • Social engineering is also a problem for domain management or hosting, where somebody can claim to be you and reset credentials
    • Using a phone number as the second factor makes this easy to do via phone number spoofing, so don't use phone numbers
  • The evil 2018 World Cup Chrome addon: in general browser addons are dangerous because they can be sold to adtech without you knowing and turn evil by silently pushing obfuscated updates
  • Classic phishing attacks talked about, such as breaking into RSA to attack Lockheed Martin
    • The introduction of Unicode in domain names opened up the web to phishing attacks, where you are fooled into using similar looking yet attacker run Signal/Telegram servers
  • Backdooring the development toolchain happened recently with NPM package malware being distributed worldwide

Really the best defenses for employee protection are operating systems like QubesOS, where a VM is spun up in isolation every time you load a browser or email app, in order to prevent a phishing or other attack from getting the goods, like VPN sign-in credentials sitting in the home directory on a developer's notebook. For 'trusting trust', update the packages and libraries in your web app manually (use lockfiles) instead of blindly running npm. Of course this is a double-edged sword: the more security you add, the more complexity, and the more opportunities for insecurity.

27.2 How to attack a web app

Another set of slides from 6.170, on why it's often easier to use a framework than roll your own, as you need to be aware of and mitigate the numerous classic web attacks.

  • Advice from the slides:
    • sanitize and escape all inputs
    • limit sessions, use multi-level sessions for critical requests like password changing (most sites require you to enter the old password to create a new one)
    • assume all requests can be out-of-order and replayed/modified, check every request
      • UDP is a good example: if an attacker can reply faster, they win (DNS uses UDP)
    • encrypt cookies, use TLS
    • use prepared statements as in the db.query() example
    • form tokens to defeat CSRF
    • don't work around same origin policy and defeat your own mitigations
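
Two of those items, escaping input and form tokens, are small enough to sketch. This is a minimal illustration for Node under my own assumptions, not what the slides or any framework actually do: escapeHtml, issueFormToken and isValidFormToken are hypothetical names, and session is just a plain object standing in for whatever session store you use. In a real app you'd let your framework or a maintained library handle both.

```javascript
const crypto = require("crypto");

// Escape the HTML-significant characters before putting user input in a page.
// The ampersand must be replaced first so the other entities aren't re-escaped.
function escapeHtml(input) {
  return String(input)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Issue a random token, store it in the session, and embed it in the form.
function issueFormToken(session) {
  session.formToken = crypto.randomBytes(32).toString("hex");
  return session.formToken;
}

// On submit, the token from the form must match the one in the session.
// A forged cross-site request can't read the page, so it can't know the token.
function isValidFormToken(session, submitted) {
  if (typeof submitted !== "string" || typeof session.formToken !== "string") {
    return false;
  }
  const expected = Buffer.from(session.formToken);
  const actual = Buffer.from(submitted);
  // timingSafeEqual throws on different lengths, so check the length first.
  return expected.length === actual.length &&
    crypto.timingSafeEqual(expected, actual);
}

// Usage
const session = {};
const token = issueFormToken(session);
const goodSubmit = isValidFormToken(session, token);    // true
const forgedSubmit = isValidFormToken(session, "forged"); // false
const escaped = escapeHtml('<script>alert("x")</script>');
// escaped: &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```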

The slides recommend OWASP, but it's a seriously disorganized, corporate-driven org where most of the advice is either incomplete or outdated. Some of the security cheat sheets are OK, such as the Top 10, but you'll be lost if you look into any of their articles on security. The book The Tangled Web ends each chapter with a checklist to follow if you're building an application and is probably still the best resource for timeless general advice that mitigates some of these attacks. There are plenty of security-as-a-service startups around too, with their own checklists and web app firewalls.

Some of the security slides are covered in the next lecture on Abstract Data Design.

27.3 Security training

In addition to the books mentioned, Trail of Bits has a good list of topics, such as lectures on web hacking and source code auditing.

27.3.1 Reputation, regulatory and maniac proofing

Not covered anywhere is security of your reputation from fake reviews, overzealous regulators or maniacs who may target you.

Reputation security can be defined as protecting yourself from fake reviews on various ratings sites and Google, and from negative PR being spread about you on social media. This is something to think about when designing your site/service: how will you defeat an organized attempt (likely by your competitors) to damage your reputation? You will want to be proactive here so an attack doesn't continue for so long that there are hundreds of bad reviews and bad PR floating around before you've noticed. Think about and research ways to counter this before it happens, such as proof of order or other measures that link a real customer to a real transaction, but honestly I don't have an answer for this; it's just something you will have to come up with a strategy for. For example, look at some of the Google Play Store reviews: there is salon software where a large percentage of the bad reviews are from people angry at some hairdresser who uses it, nothing to do with the software, driving down the reputation of that app with no response from the developers. Unless you actually went through numerous reviews you'd never know they weren't related to the software. Even worse, the only way to get these ratings outfits to pay attention to your appeals to remove fake reviews (your emails just go to an unread inbox) is to use Twitter, such as when attempting to remove bad Google reviews.

Regulatory security is protection from somebody (again, usually your competition) either suing you as a patent troll or making complaints to some authority in order to drain you of resources. Whatever service or product you're creating, you should consider avenues of attack from regulators. These can also be sales tax attacks: if you aren't charging the proper sales tax for specific regions and your competitor has their lawyer put in complaints, you could get fined. This is why there are even sales-tax-as-a-service APIs. Strategize ways to avoid possible regulator attacks by at least being aware of them and of the ways you may be targeted. For example, if you have a coin-to-coin exchange and you accept customers from a country that requires all kinds of banking regulations, you're at risk of regulatory attacks. You may even have a domain where the TLD owners have restrictions. My only advice is to think like an adversary: how could a team of lawyers, hired by your competitors after you become successful enough to be noticed, use the legal system to destroy your internet presence or the locally incorporated business that is processing payments? To avoid patent trolls, incorporate a limited liability company and do some kind of research into the design of your site. For example, you probably don't know that there's an active patent troll who sues exchange websites that have a design where the highest bid converges with the highest ask in a single column. This is why you always see these separated.

Finally, maniac proofing. These people can come from anywhere (evil competitor, random nutjob, political enemy) but their irrational vendetta against you and your service or site is typically triggered on social media. Avoid any kind of political signaling or airing of dirty company laundry on social media, such as fighting with a user on a review site. You're just inviting a maniac to be your adversary. In fact I would avoid having any kind of public personal social media accounts and use the 'separation of concerns' philosophy: a specific account for specific things, such as helpdesk or outage notifications, marketing, etc. Keep your personal life personal. As for ways to design around maniacs, think about petty sabotage. For example we'll use salon/barber software again. Suppose you're writing an app for appointments or reservations. Do you have a plan to mitigate somebody making a hundred fake reservations every day? Filling the support email with garbage? Doing dozens of chargebacks? Have you even considered how you will mitigate chargebacks? If there is a social feature such as comments on your site, will you shadowban or outright ban? With a shadowban the angry troll is allowed to continue posting, but their comments and posts only show on their end, to fool them into thinking they're just being ignored so they eventually go away to provoke somebody else. With an outright ban you are letting them know they need to go and use VPNs and other workarounds to continue shitposting on your website. The only advice I can think of is, as you build something, architect your application so that sabotaging it is not easy, for instance so that nobody can fill a day's appointments with fake clients. This doesn't mean you need to be paranoid, just make it hard for a single troll to kill your app.
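
As one example of making the fake reservation attack harder, here is a minimal sketch that caps how many reservations a single client can make per day. Everything in it is an assumption of mine: createBookingGuard is a hypothetical helper, the counts live in memory instead of a database, and how you identify a "client" (account, phone number, payment method) is the hard part that this skips entirely.

```javascript
// Hypothetical in-memory booking guard: caps reservations per client per day.
function createBookingGuard(maxPerDay) {
  const counts = new Map(); // "clientId|YYYY-MM-DD" -> reservations made so far

  return function tryReserve(clientId, day) {
    const key = `${clientId}|${day}`;
    const used = counts.get(key) || 0;
    if (used >= maxPerDay) {
      return false; // over the cap: refuse, or queue for manual review
    }
    counts.set(key, used + 1);
    return true;
  };
}

// Usage: allow at most 2 reservations per client per day.
const tryReserve = createBookingGuard(2);
const results = [
  tryReserve("client-a", "2019-01-15"),
  tryReserve("client-a", "2019-01-15"),
  tryReserve("client-a", "2019-01-15"), // third booking the same day is refused
  tryReserve("client-b", "2019-01-15"), // other clients are unaffected
];
// results: [true, true, false, true]
```

A cap like this won't stop somebody determined who creates many accounts, but it does mean a single troll with a single account can't fill the whole day.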

28 Abstract Data Design (Part 1) from 6.170