Web Systems Final Exam Notes PDF

Title Web Systems Final Exam Notes
Author Ariel Wang
Course Web Systems
Institution University of Technology Sydney



Description

Web Systems Week 1-6

Week 1 – Introduction to Unix
A user’s home directory can be specified in at least two ways:
(1) By the absolute path to the directory, e.g. /home/fred
(2) By using the tilde ( ~ ) character (pronounced “tild-duh”), i.e. ~ symbolises a user’s home directory

An absolute path starts at the root directory and navigates through the directory structure to a particular file or directory. An absolute path always leads to the same location.
A relative path starts from where the user is currently located, i.e. their present working directory. A relative path can lead to different locations depending on where the user is located.
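The difference between absolute, relative and tilde paths can be sketched in a few lines of Python. This is only an illustration: the helper name `resolve`, the file name notes.txt and the working directory /tmp are assumptions made for the example, and a real shell does more work than this.

```python
from pathlib import PurePosixPath

def resolve(path: str, cwd: str, home: str) -> PurePosixPath:
    """Resolve a path roughly the way a shell would, without touching the disk."""
    if path == "~" or path.startswith("~/"):
        # The tilde stands for the user's home directory.
        path = home + path[1:]
    p = PurePosixPath(path)
    # An absolute path ignores cwd; a relative path is joined onto it.
    return p if p.is_absolute() else PurePosixPath(cwd) / p

# The absolute path leads to the same place wherever the user is.
print(resolve("/home/fred/notes.txt", cwd="/tmp", home="/home/fred"))
# The relative path leads to different places depending on cwd.
print(resolve("notes.txt", cwd="/tmp", home="/home/fred"))
print(resolve("notes.txt", cwd="/home/fred", home="/home/fred"))
# The tilde form expands to the home directory first.
print(resolve("~/notes.txt", cwd="/tmp", home="/home/fred"))
```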

Week 2 – Operating Systems
An operating system is a piece of software that sits between all programs and the computer’s hardware. It:
• Manages your computer
• Runs programs
• Acts as the interface between user and hardware
• Provides services to programs and users
• Protects users and programs from each other

Hardware
• CPU – central processing unit e.g. Intel Core 2 Duo
• Memory
• Input/output devices e.g. mouse, keyboard, display, printer
• Storage e.g. flash, hard drive, DVD

Kernel
• Controls the hardware directly e.g. device drivers, firmware
• Provides resources and services to applications e.g. CPU, memory, storage, video, mouse, keyboard
• Manages access to privileged resources

Applications
• Programs that do ‘something’ for the users

Services
• Services are programs that run ‘behind the scenes’
• Usually provide system support e.g. security, networking

Graphical User Interface (GUI)
• A user-friendly interface on top of the operating system
• Often runs the shell commands transparently
• Shell – also known as Command Line Interface (CLI), Command Prompt, Terminal
  o A program that makes a set of commands available to the user

GUI vs Command Line (CLI)
• CLI – interact through the keyboard and a monitor which only prints text
• GUI – interact via windows, icons, menus and a pointer device (called a ‘WIMP’ interface)
  o E.g. Mac OS & Microsoft Windows
• Neither is better than the other
• Why have multiple interfaces?
  o Customisation
  o Automation
  o Understanding

Strengths & Weaknesses of GUI
Strengths
• Little/no experience required
• Good for graphics e.g. artwork, desktop publishing
• User friendly, intuitive
• Hides complexity from users

Weaknesses
• Can’t do everything → using the keyboard can be faster
• Can crash the system
• User is unsure of what the OS is really doing
• Slows computer down
• Needs better hardware
• Hides complexity from users

Strengths & Weaknesses of CLI
Strengths
• Greater flexibility
• Fine tuning → parameters
• Essential for system administration
• Faster, less overhead
• Runs on simple hardware
• Can run remotely
• Robust – difficult to crash

Weaknesses
• Hard to learn → cryptic commands & parameters
• Multiple options → more than 1 way to do things
• Output often cryptic or non-existent
• Inconsistent commands → different versions of Unix?
• No graphics
• No safety net

Batch Files and Scripting Languages
• You can automate CLIs via a batch file
• You can put a sequence of commands into an executable file → the CLI treats the file as a command
• Most CLIs include programming features → logic, calculations, variables, user input
• Some GUIs also have batch facilities
• Scripting languages
  o sh, Bash, Korn shell, Windows DOS, Python, AppleScript

Week 3 – Operating Systems
Unix
• Has been used since 1969
• Is used on most of the computers running the internet
• Mac OS X is based on Unix
• 2 original versions:
  o System V – the original version from AT&T
  o BSD – from the University of California, Berkeley

Unix Irregularities
Ad Hoc Development
• Quite a lot of Unix, especially the various scripting languages and the individual commands, grew up in an ad hoc, unregulated fashion
• The result is a more powerful and versatile OS → but one that is confusing at the user level

A Standard for Unix Commands
• IEEE tried to standardise Unix:
  o Called IEEE 1003, or ‘POSIX’
  o Defined: commands, utilities, system interfaces, scripting language
• POSIX has been largely ignored by vendors → too expensive ($$$) and too complex → 1990s Unix wars
  o Result: inconsistency and difficulty in transferring code between systems
• 2002: new Single Unix Specification (SUS) agreed. If a version meets the spec → called UNIX, otherwise called ‘Unix-like’

File Systems
Classified into:
• Logical File System – how we view the file system
• Physical File System – how these items are physically represented and stored

Logical File System
• Files
  o Executable files (programs)
  o Data files
• Directories/subdirectories
  o Store files and (usually) subdirectories
  o Often hierarchical (tree) format
• Partitions
  o Some directories may reside in different partitions from other directories
  o Abstracts physical infrastructure from users

Directories vs Partitions
• Directories
  o Create logical divisions in the file system
  o For organisational purposes
• Partitions
  o Create physical divisions in the file system
  o Can mount and unmount partitions e.g. DVD-ROM, Flash Drive
  o Unmounting one partition doesn’t impact others → partitions are independent of each other
  o Microsoft Windows often calls these ‘drives’

Theory of Trees
• File systems are typically organised as a ‘tree’
• Definition: a tree is a collection of nodes along with a relation (parenthood)
  o An edge is a ‘branch’ of the tree
  o a → b means a is the parent of b
  o Every node in a tree (except the root) has exactly one parent → the root has no parent
  o A leaf is a node that has no children
  o Siblings are nodes which have the same parent
• / – means root of the file system
• /home – means a ‘branch’ of the file system
• . – means ‘current directory’ (node)
• .. – means ‘parent of current directory’
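The tree definitions above can be tried out on a tiny, made-up directory tree. The sketch below stores only the parenthood relation; the directory names (other than / and /home) and the helper functions are hypothetical and exist only for this illustration.

```python
# Hypothetical directory tree: each node maps to its parent; the root has none.
parent = {
    "/": None,
    "/home": "/",
    "/etc": "/",
    "/home/fred": "/home",
    "/home/mary": "/home",
}

def children(node):
    """All nodes whose parent is `node`."""
    return sorted(n for n, p in parent.items() if p == node)

def is_leaf(node):
    """A leaf is a node that has no children."""
    return not children(node)

def siblings(node):
    """Other nodes that share this node's parent (the root has none)."""
    if parent[node] is None:
        return []
    return [n for n in children(parent[node]) if n != node]

print(children("/"))           # the branches directly under the root
print(is_leaf("/home/fred"))   # no children, so it is a leaf
print(siblings("/home/fred"))  # nodes with the same parent
```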

File Systems and File Manipulation

How a hard disk (or SSD) is managed and organised by an operating system:
• Disk physical structure
• Disk logical structures
• File allocation methods

Disk Physical Structure
A physical hard disk is organised into:
• Tracks: concentric rings on the platter
• Heads: read data from a platter
• Cylinders: the collection of tracks, one per platter surface, that sit in the same horizontal position
• Sectors: part of a track that holds data

Disk Structure
• A disk is a stack of magnetic platters
  o This stack is divided into cylinders
  o Each cylinder contains circular tracks → which are in turn divided into sectors
• Read/write operations are provided by the disk heads → which move concurrently along the fixed disk arm
• The disk itself rotates with constant angular velocity to provide access to every sector

Disk Formatting
• Formatting is the operation which creates the physical disk structure
• Formatting is organising and marking the surface of a disk into tracks, sectors, and cylinders

Disk Logical Structure
• Partitions: disks can be subdivided into partitions → each is an independent storage device
• Blocks: the operating system views all the disk space as an array of fixed-size logical blocks → a logical block is the smallest unit of data to transfer

File Allocation Methods
• Block: space is allocated to a file as one or more blocks
• Directory: a table of information that the OS uses to locate the blocks associated with files on a disk
• 3 common types of file allocation:
  o Contiguous allocation
  o Chained or linked allocation
  o Indexed allocation

Contiguous
• A single contiguous set of blocks is allocated to a file at the time of file creation
• To access information in block B of the file, read the block at disk address (starting block + B)
• Supports random access: you know exactly where every block is after the starting block
• A contiguous file system has a file made up of 100 blocks, numbered from 1 to 100. How many times does the file system have to be accessed to find the 50th block?
  o 1 time
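As a rough illustration of the address calculation above, the sketch below assumes a hypothetical file whose first block sits at disk block 1 and treats B as a zero-based offset from the starting block. The function name is invented for this example.

```python
def contiguous_block_address(start_block: int, offset: int) -> int:
    """Disk block that lies `offset` blocks after the file's starting block."""
    return start_block + offset

# Hypothetical file occupying disk blocks 1..100, as in the question above.
start = 1
fiftieth = contiguous_block_address(start, 49)  # the 50th block is at offset 49
print(fiftieth)  # found by arithmetic alone, so a single disk access is enough
```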

Chained
• File is written as a collection of non-contiguous blocks
• File is implemented as a linked list of blocks
• Each block contains a pointer to (the address of) the next block
  o Last block contains an invalid (negative) number as an End-Of-File marker
• Directory entry contains the head (starting) block number and the length of the file
• Chained is good for sequential access, bad for random access
• A file of 9 blocks resides on a file system using chained allocation. If the program knows that the record resides on the 7th block, how many blocks does it need to read to access the record and change it?
  o 7 blocks
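The 7-block answer above can be checked with a small simulation of a chained file. Everything concrete here is an assumption for the sake of the example: the block numbers, the use of -1 as the End-Of-File marker, and the function name.

```python
END_OF_FILE = -1  # assumed marker; the notes only say "invalid (negative) number"

def reads_to_reach(next_block: dict, head: int, k: int) -> int:
    """Count the block reads needed to reach the k-th block (1-based) of a chained file."""
    current, reads = head, 0
    for _ in range(k):
        if current == END_OF_FILE:
            raise IndexError("file has fewer than k blocks")
        reads += 1                     # the block must be read to learn the next pointer
        current = next_block[current]  # follow the chain
    return reads

# Hypothetical 9-block file scattered across the disk.
chain = [12, 4, 30, 7, 19, 2, 25, 9, 41]
next_block = dict(zip(chain, chain[1:] + [END_OF_FILE]))

print(reads_to_reach(next_block, head=12, k=7))  # every earlier block is read first
```

Reaching the k-th block always costs k reads here, which is why chained allocation suits sequential access but not random access.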

Indexed
• Tree-based allocation system
• A special ‘index’ data block will contain a list of data block numbers for the file
• If the file is too big, the ‘index’ data block will point to other ‘index’ data blocks

Indexed – Unix
• On Unix, the blocks that make up a file are tracked by a structure called an inode
• Each file/directory is referenced by an inode
• The inode system is used in Unix
• All inodes are numbered
• A directory is a special file on disk
• The inode structure is a tree

Which File Allocation Type?
• Contiguous – great for ‘direct’ storage & tiny file systems
• Chained/linked – good for archival (e.g. backup)
• Indexed – the only reasonable option for large systems

Complexity Theory
Let n be the number of blocks in the file. How many disk accesses does it take to find a particular block?
• Contiguous: O(1) ‘about 1’
• Chained/Linked: O(n) ‘roughly n blocks’
• Indexed/Inode: O(log_k(n)) blocks, where k is the number of pointers held in each index block

The lower the better!!
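The three growth rates can be compared with a simplified cost model. This is a sketch under stated assumptions rather than a description of any real file system: the function names are invented, the index is assumed to be a full tree with 10 pointers per index block, and only index levels are counted for the indexed case.

```python
def contiguous_reads(target: int) -> int:
    """O(1): the address is computed, so one access is enough."""
    return 1

def chained_reads(target: int) -> int:
    """O(n): every block before the target must be read to follow the pointers."""
    return target

def index_levels(n_blocks: int, fanout: int) -> int:
    """O(log_k n): index levels needed so that fanout**levels covers n_blocks."""
    levels, reachable = 1, fanout
    while reachable < n_blocks:
        reachable *= fanout
        levels += 1
    return levels

# Cost of reaching the last block of files of increasing size.
for n in (100, 1_000_000, 98_000_000):
    print(n, contiguous_reads(n), chained_reads(n), index_levels(n, fanout=10))
```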

Try this with a laptop-sized file system → 400,000 files, 178 GB
• Contiguous: O(1) ‘about 1’ – but not likely!
  o about 1 read
• Chained/Linked: O(n) ‘roughly n blocks’
  o 1st block – need 1 read
  o 98 millionth block – need 98 million reads
• Indexed/Inode: O(log_k(n)) blocks
  o log10(178 × 10^9) ≈ 11 reads needed!

Indexed allocation should be used for big file systems
• Google-sized file system – 15 exabytes = 15 × 10^18 bytes = 15 million TB
• Indexed/inode – log10(15 × 10^18) ≈ 19 reads needed!
• Contiguous allocation is the most efficient disk allocation algorithm in terms of disk accesses, but it is rarely achievable on a large file system

Week 4 – The Web and Security
Basic security principles
• Confidentiality
• Integrity
• Availability

Confidentiality
Information is accessible only to authorised users:
1. Can’t be seen… → Encryption
2. …by whom? → Authentication
3. …when?
4. …where? → Access controls
5. …how? → Location, transmission path, protocols

Integrity
Safeguarding accuracy/completeness of:
• Information
• Processing methods

1. Only entered/altered by authorised users
2. Cannot be altered without detection → in storage or in transit

Detection
1. Use audit trails
2. Mathematical means → hashes, checksums, message digests

Availability
Ensuring authorised users have access to information/processing when required
1. Systems survive failures → have hot/cold standby mechanisms
2. Systems resist attacks → resistant to Denial of Service (DoS) attacks
Users can access from authorised locations

Security Service/Attack
• Security service – makes use of one or more security mechanisms
• Security attack – any action that compromises the security of information
• Security mechanism – a mechanism that is designed to detect, prevent, or recover from a security attack

Typical Security Services
• Confidentiality – privacy → encryption
• Authentication – who created or sent it
• Integrity – has not been altered
• Non-repudiation – the order is final
• Access control – prevent misuse of resources
• Availability – permanence, non-erasure

Encryption
• Converting plain text into cipher text to prevent unintended recipients from reading it
• Use rot13 encryption as a simple example
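rot13 rotates each letter 13 places through the alphabet, so applying it twice returns the original text. A minimal sketch using Python's built-in `rot_13` codec follows; the wrapper name `rot13` and the sample strings are just for illustration, and rot13 offers no real security.

```python
import codecs

def rot13(text: str) -> str:
    """Rotate each letter 13 places; non-letters are left unchanged."""
    return codecs.encode(text, "rot_13")

cipher_text = rot13("Hello")
print(cipher_text)         # the scrambled form
print(rot13(cipher_text))  # applying rot13 again recovers the plain text
```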

Secret Key Cryptography
• Most trivial use of crypto
• Symmetric key encryption
• E.g. Data Encryption Standard (DES)
• E.g. you use a password to protect a file
• Problem: the key needs to be kept secret and exchanged between the parties involved in the communication
• Most common use-case: SSL (Secure Sockets Layer), the protocol behind https://
• Provides:
  o Confidentiality – stops interception
  o Integrity – stops modification
  o Authentication → verifies owner of website
• Uses:
  o Public key cryptography
  o Symmetric (shared secret) crypto

Public Key Cryptography
• Each party has 2 keys → a private key and a public key
• Can encrypt with one and decrypt with the other
• Can be used for the 4 previously mentioned security capabilities:
  o Authentication – sender encrypts with their private key and receiver decrypts with sender’s public key
  o Privacy – sender encrypts with receiver’s public key
  o Data integrity – if it’s changed along the way, it can’t be decrypted into anything meaningful
  o Non-repudiation – same reason as Authentication
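The "encrypt with one key, decrypt with the other" idea can be seen in a toy RSA key pair. RSA is not named in the notes; it is used here only as a familiar example of public key cryptography, with textbook-sized primes that are far too small to be secure and a message that is just a small number. Requires Python 3.8+ for the modular inverse via `pow`.

```python
# Toy RSA key pair (assumed example values; never use numbers this small in practice).
p, q = 61, 53
n = p * q                # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                   # public exponent  -> public key is (e, n)
d = pow(e, -1, phi)      # private exponent -> private key is (d, n)

message = 65             # a message encoded as a number smaller than n

# Privacy: encrypt with the receiver's public key, decrypt with their private key.
cipher = pow(message, e, n)
recovered = pow(cipher, d, n)

# Authentication: "encrypt" with the sender's private key, check with their public key.
signature = pow(message, d, n)
verified = pow(signature, e, n) == message

print(recovered == message, verified)
```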

Hashing
Hashing is about putting a code on data
1. Do a checksum (modular sum of the characters in the file): cksum filename.txt
2. Encrypt the checksum and sender’s name with the sender’s private key
3. Receiver uses the sender’s public key to decrypt the checksum
4. If there is an error in the checksum, then the message has been modified along the way
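The tamper-detection part of these steps can be sketched with Python's `hashlib`. Note the substitutions: SHA-256 stands in for the simple checksum described above, the private-key encryption of the checksum (steps 2 and 3) is left out, and the message contents are made up.

```python
import hashlib

def digest(data: bytes) -> str:
    """Fixed-length code computed from the data; any change alters it."""
    return hashlib.sha256(data).hexdigest()

original = b"Pay Fred $10"
sent_digest = digest(original)      # the sender computes this and sends it along

received_ok = b"Pay Fred $10"
received_bad = b"Pay Fred $1000"    # modified along the way

print(digest(received_ok) == sent_digest)   # matches: message unchanged
print(digest(received_bad) == sent_digest)  # mismatch: message was modified
```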

Typically used in:
• Secure email → need a personal digital certificate to verify that your email wasn’t tampered with
• Electronic documents → Adobe PDF allows you to digitally sign documents
• Validating software → when installing software on Windows, the installer should carry a digital signature that Windows can verify

Authentication
Who are you?
• Used for Non-Repudiation & Access Control
• Need to authenticate
  o People
  o Organisations
  o Applications

How?
• Something you know
  o Password
  o PIN
• Something you have
  o Key
  o Token
  o Certificate
• Something you are
  o Fingerprint, palm print
  o Retinal scan
  o Face recognition

Web Authentication
• Usually by ‘identity’ verification
• Most sites use a userid and password pop-up
• This is called ‘Basic’ authentication
• In Unix → typically by userid and password
  o Trivial setups: saved in the password file /etc/passwd
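With HTTP ‘Basic’ authentication, the browser sends the userid and password joined by a colon and Base64-encoded in an Authorization header. The sketch below builds that header value; the function name and the sample credentials are illustrative only.

```python
import base64

def basic_auth_header(userid: str, password: str) -> str:
    """Value of the HTTP Authorization header for 'Basic' authentication."""
    token = base64.b64encode(f"{userid}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"

header = basic_auth_header("Aladdin", "open sesame")
print("Authorization:", header)

# Base64 is an encoding, not encryption: anyone who sees the header can reverse it.
print(base64.b64decode(header.split()[1]).decode("utf-8"))
```

Because the credentials are trivially recoverable, Basic authentication is only reasonable over an encrypted connection such as https://.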

Access Controls
Physical
• Tight control of physical access
• Token based? Not tied to user?

Logical
• Enforced by:
  o Operating system
  o Application
  o Security devices e.g. firewall
• Needs configuration = management cost

Security Mechanism: Audit Log
• Audit trails/logs are essential
• Needed to
  o Measure effectiveness
  o Do forensics
  o Create alerts
• Also subject to security needs

Security & Risk Assessment
• Security should match the level of risk identified in the Risk Assessment
• Internal or external threat sources
• Vandalism?
  o Can be malicious
  o Can be politically motivated
  o Industrial espionage? Theft?

Week 5 – The Web and HCI
Web Servers
• Just a program that accepts requests from a browser over a network connection
• Returns content (e.g. a page of HTML or other data)

Web Content Management
• Web servers usually just serve web pages
  o From files on the file system e.g. index.html
• Can be dynamically generated
  o Programs generate HTML or pictures
  o E.g. PHP, ASP.NET
• Sometimes large websites manage content automatically via ‘Content Management Systems’ e.g. WordPress



Web Page Development
Web pages contain:
• Text + Markup
  o HTML → Hyper-Text Markup Language
  o CSS → Cascading Style Sheets
  o HTML structures text into headings, lists, paragraphs, tables…
• The web browser presents the text in a way which is aligned with the structure
• XHTML – a strict version of HTML which prevents misinterpretation of text by the browser

Web Pages
• Web pages are just plain text files. They contain:
  o Text
  o Markup (aka tags)
• The browser uses these ‘tags’ to control how we view the web page
• Web pages are HYPERTEXT – we use a HYPERLINK (‘link’) to reach other pages via various mechanisms
• There are several ‘dialects’ of HTML – HTML 4.0 is the most popular

The World Wide Web
• An HTML viewer is needed to see the HTML document
• Every web browser has an HTML viewer called an ‘engine’ e.g. WebKit (Safari), Trident (Internet Explorer), Gecko (Firefox)

Week 6 – The Web and HCI
Problem with HTML
• HTML defines both Structure and Presentation of web pages
  o E.g. Structural tags:
  o E.g. Presentation tags:
• Best to separate structure from presentation
  o More device types (mobile)
  o Different rendering (printing)
  o Accessibility (text to voice)

Structure – defines the components and areas on the page e.g. top of page, middle section, footer
Presentation – defines how the information is presented i.e. style e.g. colour, size, background colour, font

Style Sheets and HTML Presentation
• Style sheets move the presentation aspects out of HTML and into separate style sheets
• Many presentational tags in HTML are deprecated e.g.
  o Their usage is discouraged in new HTML and they will eventually be dropped from the HTML standard
• Cascading Style Sheets (CSS) is a simple mechanism for adding style to Web documents
• It is how we achieve the outcome of separating style from structure
• CSS allows us to define the behaviour of each tag or container for our content

4 types of selectors available:

tag
• This can be for any HTML tag
• In stylesheet: p { color: pink; }

#id
• This can be for a particular element of our page
• Starts with a # symbol
• In stylesheet: #myPink { color: pink; }
• In HTML: <h1 id="myPink">heading</h1>

.class
• Allows us to apply a style to a given group of elements
• In stylesheet: .yellowPara { color: yellow; }
• In HTML: <p class="yellowPara">this paragraph is yellow</p>

Inline
• Effectively hardcoding a style in an element without a stylesheet
• In HTML: <p style="color: yellow;">this paragraph is yellow</p>

How to Implement CSS
4 ways in which CSS can be included in your page:
• Imported – adding the contents of an external CSS file to another set of CSS rules
  o Stylesheet: @import url(/css/mystyle.css);
• Linked – an external CSS file included in the head of the page
  o HTML: <link rel="stylesheet" href="...">
• Embedded – including the CSS as a section in the head of the page
  o Within <style> ... </style>
• Inline – style included as an attribute of a tag
  o Not allowed in the Web assignment

The order of precedence in which styles cascade, from lowest to highest:
1. Browser default settings (lowest)
2. User settings in browser
3. Linked external CSS
4. Imported CSS
5. Embedded CSS
6. Inline CSS (highest)
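The precedence list can be modelled as a simple lookup: when several sources set the same property, the one furthest down the list wins. This sketch follows the ordering given in these notes only; the real CSS cascade also weighs factors such as selector specificity and source order, and the names used here are invented for the example.

```python
# Sources ordered from lowest to highest precedence, as listed in the notes.
PRECEDENCE = [
    "browser default",
    "user settings",
    "linked",
    "imported",
    "embedded",
    "inline",
]

def winning_value(declarations: dict) -> str:
    """Given {source: value} for one property, return the value that takes effect."""
    strongest = max(declarations, key=PRECEDENCE.index)
    return declarations[strongest]

# Hypothetical page where three sources all set the paragraph colour.
colour = winning_value({
    "browser default": "black",
    "linked": "pink",
    "inline": "yellow",
})
print(colour)
```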

Week 7 – Operating Systems
Programs vs Processes
• Processes are programs in execution (also called ‘tasks’)
• Every time a program is run, a new process is created
• In addition there are ‘system processes’, ‘services’ or ‘background tasks’
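The point that each run of a program creates a new process can be observed by running the same small program twice and printing its process ID (PID). This is a minimal sketch using Python's `subprocess` module; the helper name `run_child` is made up, and the actual PID values will differ on every machine and every run.

```python
import os
import subprocess
import sys

def run_child() -> int:
    """Run a one-line Python program as a new process and return that process's PID."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getpid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())

parent_pid = os.getpid()  # the process running this script
first = run_child()       # same program, run once...
second = run_child()      # ...and run again: another new process

print(parent_pid, first, second)
```

The two child PIDs are normally different from each other as well as from the parent's, although the operating system is free to reuse a PID once a process has terminated.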

Process Management
• The role of the computer is to run our programs
• Process management is how the OS handles this step
  o Starting processes
  o Managing running processes
  o Performing interprocess communication
  o Terminating processes
• Modern operating systems can appear to run multiple programs concurrently
• Run one program at a time
  o Single tasking – e.g. traditional embedded systems
  o Batch processing –...

