Lecture 1
Introduction to Web
Technologies
• OVERVIEW OF SYLLABUS, KEY TERMS, CONCEPTS AND CLASS
POLICIES
Objective
By the end of this lecture, you should be able to:
a) Describe key concepts about Web Technologies,
b) Become familiar with technologies for developing websites,
c) Demonstrate the knowledge of URL, Protocols, etc.,
d) Differentiate between a web browser and a web server,
e) Differentiate between the Internet and the WWW.
What is a Web Page?
A text file written in a markup language called HyperText Markup Language (HTML).
HTML is the standard markup language for
creating web pages to be displayed in a web browser.
Other Technologies:
◦ Cascading Style Sheets (CSS)
◦ Scripting languages such as JavaScript and PHP, used to make a website efficient and user friendly.
1.1 Internet vs. Web
The Internet:
The InterNet is short for INTERconnected NETwork.
Network of networks with common address space, common name space and common communication
protocols.
Provides the means to pass packets of data around the globe.
A set of interconnected computer networks, linked by wires, cables, wireless connections, etc.
Web:
A collection of interconnected documents and other resources.
Makes finding and exchanging documents across the Internet easier.
The world wide web (WWW) is accessible via the Internet, as are many other services including email,
file sharing, etc.
1.1 How does the Internet Work? (Continued)
Through communication protocols
Protocol: a set of rules governing the format of data sent
a specification of how communication between two computers will be carried out
The Internet is a packet-switched network.
Use TCP/IP
In 1973 Vint Cerf and Bob Kahn created the TCP/IP communication protocols
IP (Internet Protocol): responsible for making sure the packets are sent to the right
destination.
defines the packets that carry blocks of data from one node to another
TCP (Transmission Control Protocol) and UDP (User Datagram Protocol): the protocols
by which one host sends data to another.
Other application protocols: DNS (Domain Name System), SMTP (Simple Mail
Transfer Protocol), and FTP (File Transfer Protocol)
1.1 Internet Protocol (IP)
IP: A simple protocol for sending data between two computers.
IP defines how to address and route each packet to make sure it reaches the
right destination.
It is responsible for making sure the packets are sent to the right destination.
Each gateway computer on the network checks the destination IP address to determine
where to forward the message. Each device has a 32-bit IP address (IPv4), written
as four 8-bit numbers (0-255)
Find out your internet IP address: [Link]
Find out your local IP address: In a terminal, type: ipconfig (Windows) or
ifconfig (Mac/Linux)
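To make the "32-bit address written as four 8-bit numbers" idea concrete, here is a minimal sketch using Python's standard `ipaddress` module. The address 192.168.1.10 and the helper names are illustrative assumptions, not part of the lecture.

```python
import ipaddress

def dotted_to_int(dotted: str) -> int:
    """Return the 32-bit integer behind a dotted IPv4 address."""
    return int(ipaddress.IPv4Address(dotted))

def int_to_dotted(value: int) -> str:
    """Return the dotted form of a 32-bit integer."""
    return str(ipaddress.IPv4Address(value))

def octets(dotted: str) -> list:
    """Return the four 8-bit numbers (each 0-255) that make up the address."""
    return list(ipaddress.IPv4Address(dotted).packed)

address = "192.168.1.10"        # example address, chosen arbitrarily
number = dotted_to_int(address)
print(octets(address))          # the four 8-bit parts
print(f"{number:032b}")         # the same address as 32 binary digits
print(int_to_dotted(number))    # back to dotted notation
```

The dotted form is only a human-friendly way of writing one 32-bit number; routers work with the number itself.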
The Internet Protocol (IP)
The Internet authorities assign ranges of numbers to different
organizations
IP is responsible for moving packets of data from node to node
IP-based communication is unreliable
A packet contains information such as the data to be transferred, the
source and destination IP addresses, etc.
Packets travel from one local network to another through gateways
A checksum is created to ensure the correctness of the data; corrupted
packets are discarded
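The slide does not say which checksum is meant. As a sketch, assume the 16-bit ones'-complement "Internet checksum", which IPv4 applies to the packet header. The function name and the sample header bytes below are illustrative assumptions.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum over data (Internet checksum style)."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold any carry back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# A sample 20-byte IPv4 header with its checksum field (bytes 10-11) set to zero.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = internet_checksum(header)

# The sender stores the checksum in the header; the receiver recomputes over
# the whole header and expects zero. Any other value means corruption.
sent = header[:10] + checksum.to_bytes(2, "big") + header[12:]
print(hex(checksum), internet_checksum(sent) == 0)
```

A receiver that gets a non-zero result discards the packet; IP itself does not ask for it to be resent.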
Transmission Control Protocol (TCP)
TCP is a higher-level protocol that extends IP to provide additional
functionality: reliable communication
TCP adds support to detect errors or lost data and to trigger
retransmission until the data is correctly and completely received
Connection
Acknowledgment
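The acknowledge-and-retransmit idea can be sketched as a toy simulation. This is a simplified stop-and-wait scheme, not real TCP (which uses sliding windows, timers, and byte-based sequence numbers); the function name and the `lost` parameter are hypothetical.

```python
def send_with_retransmission(segments, lost):
    """Deliver segments in order over a simulated lossy channel.

    segments: list of bytes objects to send.
    lost: set of (sequence_number, attempt) pairs that the channel drops.
    Returns the reassembled data and the total number of transmissions.
    """
    delivered = []
    transmissions = 0
    for seq, segment in enumerate(segments):
        attempt = 1
        while True:
            transmissions += 1
            if (seq, attempt) in lost:
                attempt += 1          # no acknowledgment arrived: send again
                continue
            delivered.append(segment) # receiver got it and acknowledged
            break
    return b"".join(delivered), transmissions

data, count = send_with_retransmission([b"He", b"ll", b"o"], {(1, 1), (1, 2)})
print(data, count)
```

The sender only moves on to the next segment once the current one is acknowledged, which is why the data arrives complete and in order even though some transmissions were lost.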
TCP/IP Protocol Suites
HTTP, FTP, Telnet,
DNS, SMTP, etc.
TCP, UDP
IP (IPv4, IPv6)
1.1 A Brief Intro to the Internet
Origins
ARPAnet - late 1960s and early 1970s
Network reliability
For ARPA-funded research organizations
BITnet, CSnet - late 1970s & early 1980s
email and file transfer for other institutions
NSFnet - 1986
Originally for non-DOD funded places
Initially connected five supercomputer centers
By 1990, it had replaced ARPAnet for non-
military uses
Soon became the network for all (by the early 1990s)
NSFnet eventually became known as the Internet
1.1 Brief Intro to the Internet (continued)
Internet Protocol (IP) Addresses
◦ Every node has a unique numeric address
◦ Form: 32-bit binary number
◦ New standard, IPv6, has 128 bits (1998)
◦ Organizations are assigned groups of IPs for their computers
Domain names
◦ Form: [Link]-names
◦ First domain is the smallest; last is the largest
◦ Last domain specifies the type of organization
◦ Fully qualified domain name - the host name and all of the domain names
◦ DNS servers - convert fully qualified domain names to IPs
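As a small illustration of the domain-name structure described above, this sketch splits a fully qualified domain name into its host name and domains. The name `www.cs.example.edu` and the helper are made up for the example.

```python
def split_domain_name(fqdn: str) -> dict:
    """Split a fully qualified domain name into host name and domain names."""
    labels = fqdn.rstrip(".").split(".")
    return {
        "host": labels[0],           # the machine's own name
        "domains": labels[1:],       # smallest domain first, largest last
        "top_level": labels[-1],     # last domain: the type of organization
    }

parts = split_domain_name("www.cs.example.edu")   # hypothetical name
print(parts["host"], parts["domains"], parts["top_level"])
```

A DNS server takes the whole name and returns the numeric IP address; in Python the standard call for that lookup is `socket.gethostbyname`, which is not run here because it needs network access.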
Problem: By the mid-1980s, several different protocols had been invented and were being used on the
Internet, all with different user interfaces (Telnet, FTP, Usenet, mailto)
1.2 The World-Wide Web
A possible solution to the proliferation of different protocols being used on the Internet
Origins
Tim Berners-Lee at CERN proposed the Web in 1989
Purpose: to allow scientists to have access to many databases of scientific work through their own
computers
Document form: hypertext
Pages? Documents? Resources?
We’ll call them documents
Hypermedia – more than just text – images, sound, etc.
Web or Internet?
The Web uses one of the protocols, http, that runs on the Internet--there are several other protocols
(telnet, mailto, etc.)
The World Wide Web (WWW)
WWW is a system of interlinked, hypertext documents that
runs over the Internet
Two types of software:
◦ Client: a system that wishes to access the information provided by
servers must run client software (e.g., web browser)
◦ Server: an internet-connected computer that wishes to provide
information to others must run server software
◦ Client and server applications communicate over the Internet by
following a protocol built on top of TCP/IP – HyperText Transfer
Protocol (HTTP)
Basics of the WWW
WWW: a hypertext system that operates over the Internet
The core functionality of the Web is based on three standards:
1. Hypertext Transfer Protocol (HTTP): specifies how the browser and server send the information to each
other.
Hypertext is a format of information that allows one to move from one part of a document to another, or from
one document to another, through hyperlinks
2. Uniform Resource Locator (URL):specifies how each page of information is given a unique "address" at
which it can be found.
unique identifiers used to locate a particular resource on the network
3. Hyper Text Markup Language (HTML) : a method of encoding the information so it can be displayed on a
variety of devices.
defines the structure and content of hypertext documents
HTML was designed by Tim Berners-Lee with goals similar to those of the futuristic device that Vannevar Bush
called a "Memex".
Beyond hypertext is hypermedia
1.3 Web Client / Browser
A browser is a client on the Web because it initiates the communication with a
server.
Web browsers initiate network communications with servers by sending them
URLs
web browser: fetches/displays documents from web servers
• Mozilla Firefox
• Microsoft Internet Explorer (IE)
• Apple Safari
• Google Chrome
• Opera
• Netscape
HTTP provides a standard form of communication between browsers and Web
servers
1.3 Web Client/Browser (Continued)
Makes HTTP requests on behalf of the user
◦ Reformat the URL entered as a valid HTTP request
◦ Use DNS to convert server’s host name to appropriate IP address
◦ Establish a TCP connection using the IP address
◦ Send HTTP request over the connection and wait for server’s response
◦ Display the document contained in the response
◦ If the document is not a plain-text document but instead is written in HTML, this involves
rendering the document (positioning text, graphics, creating table borders, using appropriate
fonts, etc.)
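The first browser steps, turning a URL into an HTTP request, can be sketched with Python's standard `urllib.parse`. The URL is a made-up example and `build_get_request` is a hypothetical helper; the DNS lookup and TCP connection are left out so the sketch runs without a network.

```python
from urllib.parse import urlsplit

def build_get_request(url: str):
    """Return (host, port, request_text) for a plain http:// URL."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80          # 80 is the default port for http
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"                       # blank line ends the request headers
    )
    return host, port, request

host, port, request = build_get_request("http://www.example.com/docs/index.html")
print(host, port)
print(request)
```

A real browser would next resolve `host` through DNS, open a TCP connection to that address and port, send `request`, and render whatever document comes back.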
1.3 Web Browsers (Continued)
Mosaic - NCSA (Univ. of Illinois), in early 1993
First to use a GUI, led to explosion of Web use
Initially for X-Windows, under UNIX, but was ported to other platforms by late 1993
Browsers are clients - always initiate, servers react (although sometimes servers require
responses)
Most requests are for existing documents, using HyperText Transfer Protocol (HTTP)
But some requests are for program execution, with the output being returned as a
document
1.4 Web Servers
Provide responses to browser requests, either existing documents or dynamically built documents
Browser-server connection is now maintained through more than one request-response cycle
All communications between browsers and
servers use Hypertext Transfer Protocol (HTTP)
Web servers run as background processes in the operating system
Monitor a communications port on the host, accepting HTTP messages when they appear
Early Web servers descended from one of two implementations:
1. The original from CERN
2. The second one, from NCSA
1.4 Web Servers - (continued)
Web servers have two main directories:
1. Document root (servable documents)
2. Server root (server system software)
Document root is accessed indirectly by clients
Its actual location is set by the server configuration file
Requests are mapped to the actual location
Virtual document trees
Virtual hosts
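How a requested path might be mapped onto the document root can be sketched as below. The root `/var/www/html` stands in for whatever the server configuration file specifies, and `map_request_path` is a hypothetical helper, not any particular server's algorithm.

```python
from pathlib import PurePosixPath

def map_request_path(document_root: str, url_path: str) -> str:
    """Map the path part of a URL to a location under the document root."""
    parts = []
    for part in PurePosixPath(url_path).parts:
        if part == "/":
            continue                 # leading slash of the URL path
        if part == "..":
            if parts:
                parts.pop()          # step up, but never above the document root
            continue
        parts.append(part)
    return str(PurePosixPath(document_root).joinpath(*parts))

root = "/var/www/html"               # assumed value from a server configuration
print(map_request_path(root, "/courses/index.html"))
print(map_request_path(root, "/../../etc/passwd"))
```

Clients never see the real directory layout: they only name paths relative to the document root, and the server refuses to map anything outside it.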
1.4 Web Servers - (continued)
Web servers now support other Internet protocols
Currently, the two most common server configurations are Apache running on Linux and
Microsoft’s IIS running on Windows.
Apache
(open source, fast, reliable)
Began as the NCSA server, httpd
Maintained by editing its configuration file
IIS
Maintained through a program with a GUI.
1.4 Web Servers - (continued)
Main functionalities:
◦ Server waits for connection requests
◦ When a connection request is received, the server creates a new process to handle this
connection
◦ The new process establishes the TCP connection and waits for HTTP requests
◦ The new process invokes software that maps the requested URL to a resource on the
server
◦ If the resource is a file, creates an HTTP response that contains the file in the body of the
response message
◦ If the resource is a program, runs the program, and returns the output
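The last three steps (map the URL to a resource, then return a file or run a program) can be sketched without any networking. Here a dictionary stands in for the server's resources: bytes values play the role of files and callables play the role of programs. All names and paths are illustrative assumptions.

```python
def handle_request(request_line: str, resources: dict) -> bytes:
    """Build an HTTP response for a request line such as 'GET /index.html HTTP/1.1'."""
    method, path, _version = request_line.split()
    resource = resources.get(path)
    if method != "GET":
        status, body = "405 Method Not Allowed", b"Method Not Allowed"
    elif resource is None:
        status, body = "404 Not Found", b"Not Found"
    elif callable(resource):
        status, body = "200 OK", resource()   # a program: run it, return its output
    else:
        status, body = "200 OK", resource     # a file: return its contents
    head = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body

resources = {
    "/index.html": b"<h1>Welcome</h1>",            # existing document
    "/hello": lambda: b"<p>Built on request</p>",  # dynamically built document
}
print(handle_request("GET /index.html HTTP/1.1", resources).decode())
print(handle_request("GET /hello HTTP/1.1", resources).decode())
```

A real server would wrap this logic in a loop that accepts TCP connections and hands each one to a new process or thread.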
1.5 Uniform Resource Locators
General form:
scheme:object-address
The scheme is often a communications protocol, such as http, ftp, gopher, telnet, file,
mailto, and news
For the http protocol, the object-address is:
fully qualified domain name/doc path
For the file protocol, only the doc path is needed
Host name may include a port number, as in
zeppo:80 (80 is the default)
URLs cannot include spaces or certain other special characters (semicolons, colons, ...)
The doc path may be abbreviated as a partial path
The rest is furnished by the server configuration
If the doc path ends with a slash, it means it is a directory
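The pieces of a URL can be inspected with Python's standard `urllib.parse`, and `quote` shows how characters that are not allowed in a URL are percent-encoded. The URL and file name below are made-up examples.

```python
from urllib.parse import quote, urlsplit

url = "http://www.example.com:80/docs/week1/"   # hypothetical URL
parts = urlsplit(url)

print(parts.scheme)               # the scheme
print(parts.hostname)             # the fully qualified domain name
print(parts.port)                 # the port number given after the host name
print(parts.path)                 # the doc path
print(parts.path.endswith("/"))   # a trailing slash marks a directory

# Spaces and semicolons must be encoded before they can appear in a doc path.
encoded = quote("lecture notes;v1.html")
print(encoded)
```

Browsers apply this encoding automatically when a user types or clicks a link containing such characters.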
1.5 Uniform Resource Locators (continued)
General form:
scheme:object-address
When you enter the above URL into the browser, it would:
◦ ask the DNS server for the IP address of www..[Link]
◦ connect to that IP address at port 80
◦ ask the server to GET /info/regesstepp/[Link]
◦ display the resulting page on the screen
1.6 Multipurpose Internet Mail Extensions (MIME)
Originally developed for email
Used to specify to the browser the form of a file returned by the server (attached by the server to the
beginning of the document)
Type specifications
Form: type/subtype
Examples: text/plain, text/html, image/gif, image/jpeg
Server gets type from the requested file name’s suffix (.html implies text/html)
Browser gets the type explicitly from the server
Experimental types
Subtype begins with x-
e.g., video/x-msvideo
Experimental types require the server to send a helper application or plug-in so the browser can deal
with the file
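The suffix-to-type lookup that a server performs can be sketched with Python's standard `mimetypes` module. The file names are invented for the example; a real server uses its own configured table.

```python
import mimetypes

# Guess the MIME type from the file name suffix, as a server would.
for name in ("index.html", "notes.txt", "logo.gif", "photo.jpeg"):
    mime_type, _encoding = mimetypes.guess_type(name)
    main_type, subtype = mime_type.split("/")
    print(f"{name}: type={main_type} subtype={subtype}")
```

The server then sends the result in the Content-type header, so the browser does not need to look at the file name at all.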
1.7 The Hypertext Transfer Protocol (continued)
Four categories of header fields:
General, request, response, and entity
Common request fields:
Accept: text/plain
Accept: text/*
If-Modified-Since: date
Common response fields:
Content-length: 488
Content-type: text/html
Can communicate with HTTP without a browser
telnet [Link] http
GET /user1/[Link] HTTP/1.1
Host: [Link]
1.7 The HyperText Transfer Protocol (continued)
Response Phase
Form:
Status line
Response header fields
blank line
Response body
Status line format:
HTTP version status code explanation
Example: HTTP/1.1 200 OK
(Current version is 1.1)
Status code is a three-digit number; first digit specifies the general status
1 => Informational
2 => Success
3 => Redirection
4 => Client error
5 => Server error
The header field, Content-type, is required
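The status line format and the meaning of the first digit can be captured in a few lines. The function names are hypothetical; the five categories come from the list on this slide.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}

def parse_status_line(line: str):
    """Split 'HTTP-version status-code explanation' into its three parts."""
    version, code, explanation = line.split(" ", 2)
    return version, int(code), explanation

def status_class(code: int) -> str:
    """Return the general status given by the first digit of the code."""
    return STATUS_CLASSES[code // 100]

version, code, explanation = parse_status_line("HTTP/1.1 200 OK")
print(version, code, explanation, "->", status_class(code))
print(404, "->", status_class(404))
```

Splitting on at most two spaces matters because the explanation itself may contain spaces, as in "Not Found".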
1.7 The HyperText Transfer Protocol (continued)
An example of a complete response header:
HTTP/1.1 200 OK
Date: Sat, 25 Jul 2009 [Link] GMT
Server: Apache/2.2.3 (CentOS)
Last-modified: Tue, 18 May 2004 [Link] GMT
Etag: "1b48098-16a-3dab592dc9f80"
Accept-ranges: bytes
Content-length: 364
Connection: close
Content-type: text/html; charset=UTF-8
Both request headers and response headers must be followed by a blank line
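The blank line is what lets a client tell headers from body. The sketch below splits a raw response at that blank line and collects the header fields; the sample response and the helper name are illustrative assumptions.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into status line, header fields, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")   # the blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line, header_lines = lines[0], lines[1:]
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # field names are case-insensitive
    return status_line, headers, body

raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-length: 12\r\n"
    b"Content-type: text/html; charset=UTF-8\r\n"
    b"\r\n"
    b"<p>Hello</p>"
)
status_line, headers, body = parse_response(raw)
print(status_line)
print(headers["content-type"])
print(len(body) == int(headers["content-length"]))
```

Comparing the body length with Content-length is a simple check that the whole document arrived.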
1.8 Security
There are many kinds of security problems with the Internet and the Web
One fundamental problem is getting data between a browser and a server without it being
intercepted or corrupted in the process.
Security issues for a communication between a browser and a server:
1. Privacy
2. Integrity
3. Authentication
4. Nonrepudiation
Destruction of data on computers connected to the Internet
Viruses and worms
Denial-of-Service (DoS)
Created by flooding a Web server with requests
1.8 Security (continued)
The basic tool to support privacy and integrity is encryption
Originally, a single key was used for both encryption and decryption, which requires the
sender of an encrypted document to somehow transmit the key to the receiver.
Solution: (1976, Diffie and Hellman)
Public-key encryption
Use a public/private key pair
Everyone uses your public key to encrypt messages sent to you
You decrypt them with your matching private key
It works because it is computationally infeasible to derive the private key from a given public key
RSA is the most widely used public-key algorithm
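A toy version of RSA shows the public/private key idea. The primes 61 and 53 are far too small to be secure and are chosen only so the numbers stay readable; real RSA also adds padding, which this sketch omits. It needs Python 3.8 or later for the modular inverse.

```python
# Toy RSA key generation (assumed tiny primes, for illustration only).
p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, must share no factor with phi
d = pow(e, -1, phi)            # private exponent: e * d leaves remainder 1 mod phi

public_key = (e, n)            # given to everyone
private_key = (d, n)           # kept secret

def encrypt(message: int, key) -> int:
    exponent, modulus = key
    return pow(message, exponent, modulus)

def decrypt(cipher: int, key) -> int:
    exponent, modulus = key
    return pow(cipher, exponent, modulus)

message = 65                   # a message must be a number smaller than n
cipher = encrypt(message, public_key)
print(cipher, decrypt(cipher, private_key))
```

Anyone can run `encrypt` with the public key, but undoing it requires `d`, and finding `d` from `(e, n)` means factoring `n`, which is impractical when the primes are hundreds of digits long.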
1.9 The Web Programmer’s Toolbox (continued)
HTML: HyperText Markup Language
◦ It is a text file containing small markup tags that tell the Web browser how to display
the page
XHTML: eXtensible HyperText Markup Language
◦ It is almost identical to HTML 4.01
◦ It is a stricter and cleaner version of HTML
CSS stands for Cascading Style Sheets
◦ It defines how to display HTML elements
1.9 The Web Programmer’s Toolbox
HTML: To describe the general form and layout of documents
An HTML document is a mix of content and controls
Controls are tags and their attributes
Tags often delimit content and specify something about how the content should be arranged in the
document
Attributes provide additional information about the content of a tag
Tools for creating HTML documents
HTML editors - make document creation easier
Shortcuts to typing tag names, spell-checker,
WYSIWYG HTML editors
Need not know HTML to create HTML documents
1.9 The Web Programmer’s Toolbox (continued)
Plug-ins: Integrated into tools like word processors, effectively converting them to WYSIWYG HTML
editors
Filters: Convert documents in other formats to HTML
Advantages of both filters and plug-ins:
- Existing documents produced with other tools can be converted to HTML documents
- Use a tool you already know to produce HTML
Disadvantages of both filters and plug-ins:
- HTML output of both is not perfect - must be fine tuned
- HTML may be non-standard
- You have two versions of the document, which are difficult to synchronize
1.9 The Web Programmer’s Toolbox (continued)
XML
- A meta-markup language
- Used to create a new markup language for a particular purpose or area
- Because the tags are designed for a specific area, they can be meaningful
- No presentation details
- A simple and universal way of representing and transmitting data of any textual kind
JavaScript
- A client-side HTML-embedded scripting language
- Only related to Java through syntax
- Dynamically typed; its object model is prototype-based, unlike Java's class-based model
- Provides a way to access elements of HTML documents and dynamically change them
1.9 The Web Programmer’s Toolbox (continued)
Flash
- A system for building and displaying text, graphics, sound, interactivity, and animation (movies)
- Two parts:
1. Authoring environment
2. Player
- Supports both motion and shape animation
- Interactivity is supported with ActionScript
PHP
- A server-side scripting language
- Similar to JavaScript
- Great for form processing and database access
through the Web
1.9 The Web Programmer’s Toolbox (continued)
Ajax
- Asynchronous JavaScript + XML
- No new technologies or languages
- Much faster for Web applications that have
extensive user/server interactions
- Uses asynchronous requests to the server
- Requests and receives small parts of documents, resulting in much faster responses
Java Web Software
- Servlets – server-side Java classes
- JavaServer Pages (JSP) – a Java-based approach to server-side scripting
- An alternative to servlets
- JavaServer Faces – adds an event-driven interface model on JSP
1.9 The Web Programmer’s Toolbox (continued)
- [Link]
- Does what JSP and JSF do, but in the .NET
environment
- [Link] languages to be used as
server-side scripting language
- [Link] documents are compiled into classes
Ruby
- A pure object-oriented interpreted scripting language
- Every data value is an object, and all operations are via method calls
- Most operators can be redefined by the user
- Both classes and objects are dynamic
- Variables are all type-less references to objects
1.9 The Web Programmer’s Toolbox (continued)
Rails
- A development framework for Web-based applications
- Particularly useful for Web applications that access databases
- Written in Ruby and uses Ruby as its primary user language
- Based on the Model-View-Controller architecture
Client-Side Programming
Scripting language: a lightweight programming language
Browser scripting: JavaScript
◦ Designed to add interactivity to HTML pages
◦ Usually embedded into HTML pages
◦ What can a JavaScript Do?
◦ Put dynamic text into an HTML page
◦ React to events
◦ Read and write HTML elements
◦ Validate data before it is submitted to a server
◦ Create cookies
◦ …
Server-Side Programming
The requests cause the response to be generated
Server scripting:
◦ CGI/Perl: Common Gateway Interface (*.pl, *.cgi)
◦ PHP: Open source, strong database support (*.php)
◦ ASP.NET: Microsoft product, uses the .NET framework (*.aspx)
◦ Java via JavaServer Pages (*.jsp)
◦…
CGI
Common Gateway Interface:
◦ CGI provides a way by which a web server can obtain data
from (or send data to) database, and other programs, and
present that data to viewers via the web.
◦ A CGI program can be written in any programming
language, but Perl is one of the most popular
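Since a CGI program can be written in any language, here is a minimal sketch in Python. The web server passes request details in environment variables (the query string arrives in `QUERY_STRING`) and sends whatever the program prints back to the browser. The `name` parameter and the greeting are invented for the example.

```python
import html
import os
from urllib.parse import parse_qs

def cgi_response(environ) -> str:
    """Build the CGI output: header line, blank line, then the HTML body."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]       # hypothetical form field
    safe_name = html.escape(name)                # never echo raw user input into HTML
    body = f"<html><body><h1>Hello, {safe_name}!</h1></body></html>"
    return "Content-Type: text/html\r\n\r\n" + body

if __name__ == "__main__":
    # When run by a web server, os.environ carries the request information.
    print(cgi_response(os.environ), end="")
```

Requesting the script with `?name=Ada` appended to its URL would produce a page that greets Ada; with no query string it falls back to the default.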
What’s Ahead?
HTML …., XHTML …. HTML5
CSS….
Simple client-side interactivity (JavaScript)
Simple server-side interactivity (CGI/Perl, PHP)
In this course, we will provide an overview of the basics, and learn how to use the
web resources to help build a web page.
Website Design Principles
1. Know your purpose
2. Easy navigation
3. Responsive design
4. Consistency
5. Comfortable UI
6. Content meets goals
7. Performance
8. Feedback about progress
9. Avoid alert/dialogs when not necessary
[Link] 404/500 errors
[Link] & readability
Website Planning and Publishing
Prepare your content
text, images, etc.
Design and build your website
Prototype design before development
Find web host
Scalability, reliability, speed
Do a quality assurance audit
Test links, follow semantic structure, optimize images, check all grammar
and spelling.
Publish your website.