Web programming basics
HTML <script> tags
The <script> tag includes the JS in the HTML page so the browser can execute it.
- JS can be inline between the <script> and </script> tags
- More common to use the src attribute in the <script> tag to specify the URL of a JS file. Still requires the </script> closing tag
  - Separates content and behavior
  - Create a single JS file that can be reused for multiple HTML files
  - JS file is only downloaded once and subsequently can be retrieved from the cache
- src takes an arbitrary URL - you can import JS code that is exported by other web servers
- JS files use the .js extension
type attribute
There are two reasons to use the type attribute:
- specify the script as a module
- embed data into the web page without displaying it
Historically, people used type="application/javascript", but that has been deprecated for a long time.
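A rough sketch of the second use (embedding data): the page carries JSON in a script tag whose type the browser does not execute, and JS parses it later. The element id page-data, the helper name readEmbeddedData, and the sample payload are all hypothetical. The helper takes the document as a parameter so it can also be tried with a stand-in object outside a browser.

```javascript
// Assumed markup (hypothetical id and payload):
//   <script id="page-data" type="application/json">{"user": "ada", "items": 3}</script>
// The browser neither runs nor displays this block because the type is not a JS type.

// Hypothetical helper: read and parse the embedded JSON from a document.
function readEmbeddedData(doc, id) {
  const el = doc.getElementById(id);
  if (el === null) {
    return null; // no element with that id
  }
  return JSON.parse(el.textContent);
}

// In a browser: const data = readEmbeddedData(document, 'page-data');
```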
Modules
If you are using modules and NOT using a code bundler that combines all modules into a single JS file, you have to load the top-level JS file with the type="module" attribute:
<script src="index.js" type="module"></script>
This loads the top-level module, which loads all dependent modules.
async and defer
When the HTML parser encounters a <script> tag, it has to run the script before it continues parsing, because the script might output HTML
- Called synchronous or blocking script execution
- This is because when JS first came out, there was no API for manipulating the DOM - JS had to generate content while the document was loading, using the document.write() method to inject text into the HTML doc. document.write() is bad style
  - This slows down page loads
A <script> tag can include the boolean defer or async attributes:
<script defer src="index.js"></script>
<script async src="index.js"></script>
These attributes tell the browser that the script does not use document.write(), so it can continue parsing the HTML doc while the script downloads:
- defer: Execute the script after the document is fully loaded and parsed and is ready to be manipulated
- async: Execute the script ASAP, but do not block parsing to download the script
- If both are present, async takes precedence over defer
- Deferred scripts run in the order that they are listed in the HTML doc. Async scripts run as soon as they finish loading, so they might run out of order
- Module scripts:
  - run as defer, by default
  - If you add async, the code executes as soon as the module and its dependencies are loaded
- You can just put the <script> tag at the end of the doc to get behavior similar to defer and async
  - Use defer or async if you have to load scripts in the head
DOM
Document Object Model is the API for working with the Document object
- Document object represents the HTML doc displayed in the browser
- HTML documents contain nested HTML elements formed in a tree
- DOM API mirrors tree structure of HTML doc
- For each HTML element, there is a JS object
- For each string of text, there is a text object
- These objects are instances of classes that are subclasses of the Node class
- JS can query and traverse the Nodes with the DOM API
- Node trees use familial language - parent, child, sibling, descendant, ancestors
- DOM API can create Element and Text nodes and insert them in relation to other Element objects
- There is a JS class for each HTML tag type
- Ex: HTMLBodyElement class, HTMLTableElement class
- There is an instance of the JS class - called a JS element object - for each occurrence of the HTML tag in the document
- Each JS element object has properties that correspond to HTML tag attributes
- Some JS classes define properties that have no corresponding attribute on the HTML tag
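To make the tree traversal concrete, here is a minimal sketch that walks a subtree and collects the content of every Text node. It relies only on the standard nodeType, nodeValue, and childNodes properties. The helper name collectText is hypothetical, and the root node is passed in so the function can be tried with plain stand-in objects.

```javascript
// Node type values defined by the DOM (same as Node.ELEMENT_NODE and Node.TEXT_NODE).
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Hypothetical helper: collect the text of every Text node under `node`.
function collectText(node, out = []) {
  if (node.nodeType === TEXT_NODE) {
    out.push(node.nodeValue);
  } else if (node.nodeType === ELEMENT_NODE) {
    for (const child of node.childNodes) {
      collectText(child, out); // recurse into children in document order
    }
  }
  return out; // other node types (comments, etc.) are skipped
}

// In a browser: collectText(document.body).join('')
```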
Global object
There is one global object per browser window or tab, and all JS code running in the window or tab shares the same global object:
- JS’s standard library is defined on the global object, and it is the entrypoint for some web APIs, such as document and fetch()
- In web browsers, the global object is also the window object, which represents the current web browser window
  - Best practice to use the window. prefix when accessing properties of the global object. Ex: window.innerWidth
Namespaces
Modules: constants, variables, funcs, and classes defined in a module are private and need to be explicitly exported, and then can be imported by another module
Non-modules: All scripts share a namespace and can share vars, funcs, etc.
- Be careful with naming conflicts
- var and function declarations create shared global object properties. This means that you can invoke them with window.<function>
- ES6 let, const, and class do not create properties on the global object, but still be mindful of namespaces
Program execution
A JS program is all JS code in or referenced from a document that shares a global Window object
- non-module scripts also share a top-level namespace
- an <iframe> has a different global Window object and Document object, so it's a separate program
  - If the container and contained document share the same origin, they can communicate with each other
First phase
Load JS content:
- Document content is loaded
- This stage should take less than a second
- Code in inline and external <script> elements is run in the order the elements appear in the document, taking into account defer and async attributes.
  - Each script is run from top to bottom
- Some scripts just define functions and classes for the second phase
  - Ex: Register event handlers or callbacks
Detailed breakdown:
- Browser creates a Document object and parses the web page. Adds Element objects and Text nodes as it parses the HTML. document.readyState is loading
- When the HTML parser encounters a <script> tag without defer, async, or type="module", it adds the script to the document and executes it before it continues parsing. These scripts can use document.write() to manipulate the document, but generally they just define functions and register event handlers
- If the HTML parser encounters an async <script> tag, it downloads the script and continues parsing the document. Do not use the document.write() method in async scripts
- document.readyState changes to interactive
- defer scripts are executed in the order they are encountered in the document. They have access to the complete document - but do NOT use document.write(). Async scripts might also be executed.
- DOMContentLoaded event is fired on the Document object. This begins the transition to phase 2. async scripts might still be executed.
- Document is completely parsed, but might be waiting on images or other content. After all content is loaded and all async scripts are loaded and executed, document.readyState is changed to complete and the browser fires a load event on the Window object.
- Completely in second phase, event handlers are invoked asynchronously.
Second phase
Asynchronous and event-driven:
- In response to events, the browser executes event handlers and callbacks that were registered in the first phase
- This phase lasts as long as the document is displayed in the browser
- Event examples:
- mouse clicks, keystrokes
- network activity
- document resource loading
- elapsed time
- errors in JS code
First events to occur are the DOMContentLoaded and load events:
- These events are used as a trigger or starting signal for JS actions like registering handlers on the load event.
Event | When It Fires | Use Case |
---|---|---|
DOMContentLoaded | After HTML is parsed, before full page load | Run JS that doesn’t depend on images or CSS |
load | After the entire page (CSS, images, etc.) loads | Initialize app after all resources load |
pageshow | Similar to load, but also fires when the page is restored from the back/forward cache | Detect when page is restored from cache |
beforeunload | When the user is about to leave | Show warnings or save data |
unload | When the page is closing | Clean up resources (e.g., logs, API calls) |
visibilitychange | When the page is hidden or visible | Pause/resume background tasks |
window.addEventListener('DOMContentLoaded', () => alert('DOMContentLoaded'));
document.addEventListener('DOMContentLoaded', () => alert('DOMContentLoaded'));
// document doesn't have the `load` event, only window does
window.addEventListener('load', () => alert('load'));
Threading model
JS is single-threaded:
- there are no locks, deadlocks, race conditions
- no two event handlers can execute at the same time
- the browser does not respond to user input when scripts and event handlers are executing, so you can’t write code that runs too long
Web worker - a controlled form of concurrency
- background thread that can perform tasks without freezing the UI
- can’t access the document, does not share its state with other workers or main thread
- communicates with the main thread and other workers through asynchronous message events
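A small sketch of what single-threaded execution means in practice: a timer callback requested with a 0 ms delay still cannot run until the current script finishes, even if that script is busy for a while. The function name runToCompletionDemo and the 20 ms busy-wait are arbitrary choices for illustration.

```javascript
// Hypothetical demo: returns the log as it looks when the current script finishes.
function runToCompletionDemo() {
  const log = [];
  // Ask for the callback to run as soon as possible (0 ms delay).
  setTimeout(() => log.push('timer callback'), 0);

  log.push('script start');
  const start = Date.now();
  while (Date.now() - start < 20) {
    // Busy-wait for about 20 ms; nothing else can run on this thread meanwhile.
  }
  log.push('script end');
  return log; // the timer callback has not run yet
}

const demoLog = runToCompletionDemo();
console.log(demoLog.join(', ')); // script start, script end
```

The timer callback runs only after this script returns control to the event loop, which is also why a long-running handler freezes the UI.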
Input/Output
JS takes the following inputs:
- Document, which JS accesses with the DOM API
- User input - mouse clicks, keyboard, text, etc
- URL of document being displayed, available with document.URL
- HTTP cookie req header with document.cookie. Cookies are usually server side, but JS can read/write them in browser
- Global navigator property - gives info about the web browser:
  - navigator.userAgent
  - navigator.language
  - navigator.hardwareConcurrency
- Global screen property - info about user’s display size
  - screen.width
  - screen.height
- navigator and screen objects are like env vars
Produces output:
- In the DOM
- In the console, but this is for debugging
Errors
JS programs don’t crash, they just don’t do what they’re supposed to and then log errors to the console:
- You can set a few properties on the window object to handle errors. Mostly useful for telemetry
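// log errors and unexpected failures: assign a handler function, do not call onerror
window.onerror = (message, source, lineno, colno, error) => { /* ... */ };
// when a Promise is rejected and there is no .catch()
window.onunhandledrejection = (e) => { /* ... */ };
// equivalent registration with addEventListener; e.reason holds the rejection value
window.addEventListener('unhandledrejection', (e) => { /* ... */ });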
Web security
Restrictions
- JS cannot read or write to the filesystem
- JS cannot access general-purpose networks. JS can only make HTTP reqs and use Websockets
Same-origin policy
A JS script can only read properties of windows and documents that share the same origin:
- origin - protocol, host, and port of the URL that loaded the document
- origin of the document that the script is embedded in, not the script itself
- If Host A serves a web page with a script loaded from Host B, then the script origin is Host A
- Different web server = different origin
- Different scheme = different origin
- Different port on same server = different origin
- iframes can’t read properties of the page hosting them
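As a quick illustration of how an origin is compared, the standard URL class exposes an origin property that combines protocol, host, and port. The helper name sameOrigin is hypothetical, and this sketch is meant for absolute http and https URLs only.

```javascript
// Hypothetical helper: true when two absolute URLs share protocol, host, and port.
function sameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

console.log(sameOrigin('https://example.com/a', 'https://example.com/b'));    // true
console.log(sameOrigin('https://example.com/', 'http://example.com/'));       // false: scheme differs
console.log(sameOrigin('https://example.com/', 'https://docs.example.com/')); // false: host differs
console.log(sameOrigin('https://example.com/', 'https://example.com:8443/')); // false: port differs
```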
Applies to HTTP reqs too:
- By default, JS can make HTTP reqs to the web server that loaded the document, but cannot read responses from other web servers unless the server opts in with CORS. For access between windows and documents, a script can set document.domain
  - document.domain: when a site has multiple subdomains (docs.example.com, support.example.com, example.com), documents on different subdomains might need to access each other's properties. A script with the docs.example.com origin can set document.domain to example.com to allow this. Both documents have to set it, and the setter is deprecated in modern browsers.
  - CORS: Cross-Origin Resource Sharing. Lets a server decide which origins can read its responses.
    - The browser adds an Origin request header that names the origin of the document making the request
    - The server adds an Access-Control-Allow-Origin response header that names the origin it allows, or * for any origin
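The browser-side decision for the Origin and Access-Control-Allow-Origin headers can be modeled roughly as follows. This is a simplified sketch, not the full algorithm: it assumes a request sent without credentials and ignores preflight requests. The helper name corsAllowsRead is hypothetical.

```javascript
// Simplified model of the check a browser makes for a request sent without credentials.
// requestOrigin: the value the browser sent in the Origin request header.
// allowOriginHeader: the Access-Control-Allow-Origin response header, or null if absent.
function corsAllowsRead(requestOrigin, allowOriginHeader) {
  if (allowOriginHeader === null || allowOriginHeader === undefined) {
    return false; // the server did not opt in
  }
  if (allowOriginHeader === '*') {
    return true; // any origin may read the response
  }
  return allowOriginHeader === requestOrigin; // must match exactly
}

console.log(corsAllowsRead('https://docs.example.com', 'https://docs.example.com')); // true
console.log(corsAllowsRead('https://docs.example.com', 'https://example.com'));      // false
```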
Cross-site scripting
When an attacker injects HTML tags or scripts into your website:
- If you dynamically generate content based on user input, then you must sanitize it by removing any embedded HTML tags, or you are vulnerable
- Called ‘cross-site’ because more than one site is involved:
- The site that injects HTML might get users to click on something
- Then the vulnerable site runs code from the malicious site. The attacker can then read cookie info or track keystrokes, among other things
- Prevention options:
- Remove HTML tags from untrusted data before you create dynamic content
- Always display untrusted content in an iframe with the sandbox attribute set. This disables scripting and other things
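A closely related option to removing tags is escaping: the special HTML characters in untrusted text are replaced with entities, so the browser displays them instead of interpreting them. Note that this is escaping rather than the tag removal described above. It is a minimal sketch for text placed in element content or quoted attribute values, and the names escapeHtml and HTML_ESCAPES are hypothetical.

```javascript
// Characters with special meaning in HTML, mapped to their entity forms.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Hypothetical helper: make untrusted text safe to insert as HTML text.
function escapeHtml(untrusted) {
  return String(untrusted).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<img src=x onerror="alert(1)">'));
// &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

When the goal is only to show text, assigning it to an element's textContent property avoids HTML parsing altogether.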