Web Systems - Lecture notes PDF

Title Web Systems - Lecture notes
Author Natalie Habib
Course Web Systems
Institution University of Technology Sydney
Pages 14



Description

Web Systems
The web is a network of networks.

Unix/Unix-like
Used on most computers running the internet
• e.g. web servers, domain name servers, email servers, web hosting
• Typically not used by ordinary users (although Mac OS X is Unix-based)

There are several versions of Unix, with many differences such as varying commands or different directory structures. Most derive from 2 original versions:
• System V: original version from AT&T
• BSD: from University of California at Berkeley

Ad hoc development
Much of Unix (especially scripting languages + commands) accumulated in an ad hoc, unregulated, haphazard fashion.
• Results in a more powerful + versatile operating system
• However, can be confusing for regular users

IEEE 1003 or POSIX
Tries to standardise Unix
• Defined: commands, utilities, system interfaces and scripting language
• Largely ignored by vendors due to large cost and complexity
  o Consequence: inconsistency and difficulty in transferring code between systems

2002: Single Unix Specification (SUS)
Created standard for Unix commands
• If a version meets the specs, it can be called Unix
• Otherwise, it is called “Unix-like”

Operating Systems
IBM VM/CMS: Virtual Machine/Conversational Monitor System (1966)
• Only operating system used for longer than Unix (1969)

Reasons for Unix’s survival:

No owner
• Due to it being a set of ideas
• Anyone can implement these ideas

Based on simple concepts
• i.e. files, processes, permissions and users
• Even hardware devices are represented as files, e.g. /dev/mouse
• Allowed Unix to incorporate new ideas and technologies easily

Portable
• Written in the programming language C (not tied to a CPU, usable by any computer with a C compiler)
• Hardware technology has evolved but is still conceptually the same

Free (some varieties)
• 1993/1994 onwards: free (Linux, FreeBSD)
• Especially available on cheap Intel-based PCs, which are popular due to MS Windows

Efficient, stable and relatively secure
• Fast + stable (system crashes rare)
• Designed for security on multi-user systems: files have owners and tight security permissions → fewer viruses

Unix as a set of tools approach
• Unix CLI’s powerful features: simple commands, pipes and I/O redirection
• Create powerful ad hoc tools
• Appeals to technically oriented users

Operating systems
Software that sits between all programs and a computer’s hardware
• Manages the computer
• Runs programs
• Interface between user + hardware
• Protects users and programs from each other
• Manages resources

File system

Part of the operating system; manages data storage + access.

Logical: user view of a file system
• Files: executable files (programs), data files
• Directories/subdirectories: store files, often in a hierarchical (tree) format
• Partitions: some directories may reside in different partitions from others; abstracts the physical infrastructure from users

Physical: how items are physically represented + stored

Directories vs Partitions
Directories
• Create logical divisions in the file system
• For organisational purposes
Partitions
• Create physical divisions in the file system
• Can mount/unmount partitions (e.g. DVD-ROM, flash drive) → unmounting one doesn’t impact others
• Partitions are independent of each other
• Named “drives” by Microsoft Windows



Images: A conceptual filesystem structure. Some filesystems also have “drives”, which act as multiple roots at the top of the diagram (as seen on the right compared to the left).

Theory of Trees
A tree is a collection of nodes along with a relation (parenthood)
• Edge: tree “branch”. A → B means A is the parent of B
• Every node in the tree has exactly 1 parent, except for the root, which has none
• Leaf: node with no children
• Siblings: nodes with the same parent

Unix filesystem is a tree
• / = root of file system
• /home = branch of file system
• . = current directory (node)
• .. = parent of current directory
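
The tree notation above can be tried out with a short Python sketch. It uses the standard `posixpath` module to collapse `.` and `..` purely as text, without touching a real disk. The helper name `resolve` and the example directories are illustrative assumptions, not part of the notes.

```python
import posixpath


def resolve(current_dir: str, relative: str) -> str:
    """Resolve `relative` against `current_dir`, collapsing '.' and '..'.

    Hypothetical helper for illustration: it only manipulates path strings,
    so symbolic links on a real system are not taken into account.
    """
    return posixpath.normpath(posixpath.join(current_dir, relative))


print(resolve("/home/alice", "."))             # the current node
print(resolve("/home/alice", ".."))            # its parent
print(resolve("/home/alice", "../bob/notes"))  # up one level, then down
print(resolve("/", ".."))                      # the root has no parent
```

Because the root has no parent, asking for `..` at `/` simply stays at `/`.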

File storage
• Distributed file system: many drives connected to several servers
  o Used by Google, which has 20+ data centres
• Central file system (e.g. UTS uses a “SAN”)
• Hard disk or SSD (Solid State Drive): computers typically use one of these physical storage devices

Hard drives and SSDs
Disk physical structure
Organised into:
• Tracks: concentric rings on a platter
• Heads: read data from a platter
• Cylinders: the set of tracks at the same position on every platter
• Sectors: part of a track, holding data

Low level hard-disk data storage

Disk structure
A disk is a stack of magnetic platters
• This stack is divided into cylinders
• Each cylinder contains circular tracks, which are then divided into sectors
• Read/write operations are performed by disk heads (they move together on the disk arm)
• The disk itself rotates with constant angular velocity to provide access to every sector

SSD
Have no moving parts but emulate a rotating hard disk. Newer technology makes better use of SSDs.

Disk formatting
Formatting is the operation which creates the physical disk structure: the organising and marking of the surface of a disk into tracks, sectors and cylinders.

Disk logical structures
• Partitions: the subdivisions of a disk, with each partition being an independent storage device
• Blocks: the operating system views all disk space as an array of fixed-size logical blocks
  o Logical blocks are the smallest unit of data to transfer

File allocation methods
There are 3 common types of file allocation:
• Contiguous (right image): direct storage + tiny file systems
• Chained/linked: archival purposes (backup)
• Indexed (e.g. inode): only reasonable option for large systems


Chained/linked allocation
• File written as a collection of noncontiguous blocks
• File implemented as a linked list of blocks
• Each block contains a pointer to (the address of) the next block
  o Last block contains an invalid (negative) number as end-of-file marker
• Directory entry contains the head (starting) block number and the length of the file
Chained is good for sequential access, bad for random access.
Sequence: 3, 9, 17, 18, 19, 20
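
A minimal Python sketch of the chained scheme above, using the example sequence 3, 9, 17, 18, 19, 20. The in-memory `disk` dictionary, the data letters and the function names are assumptions made for illustration; a real file system stores the pointer inside each disk block.

```python
END_OF_FILE = -1  # invalid (negative) block number marking the last block

# Hypothetical disk: block number -> (data, number of the next block)
disk = {
    3: ("A", 9),
    9: ("B", 17),
    17: ("C", 18),
    18: ("D", 19),
    19: ("E", 20),
    20: ("F", END_OF_FILE),
}


def block_chain(disk, head):
    """Follow the pointers from the head block and list every block visited."""
    chain = []
    block = head
    while block != END_OF_FILE:
        chain.append(block)
        _, block = disk[block]
    return chain


def read_block(disk, head, index):
    """Return (data, blocks_accessed) for the index-th block of the file."""
    block = head
    accessed = 0
    for _ in range(index):
        _, block = disk[block]  # must read a block just to learn the next one
        accessed += 1
        if block == END_OF_FILE:
            raise IndexError("index is past the end of the file")
    data, _ = disk[block]
    return data, accessed + 1


print(block_chain(disk, 3))
print(read_block(disk, 3, 0))
print(read_block(disk, 3, 5))
```

Reading the first block touches one block, while reading the last touches all six, which is why random access is poor with this method.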

Indexed allocation
“Tree” based allocation system
• A special “index” data block contains the list of data block numbers for the file
• If the file is too big: the “index” data block points to other “index” data blocks (left)

Indexed: UNIX
• On UNIX, each file is described by an “inode” block
• Each file/directory is referenced by an inode
• Efficient use of space and fast to read blocks


iNodes and Directories in UNIX
• Inode system used in UNIX
• All inodes are numbered
• Directory: special block/file on disk
Directories contain the names of files and the inode number for each file.
The inode structure (image) is a “tree”.

What iNodes store:
• File metadata: size, owner id, group id, permissions, timestamps
  o Doesn’t store the name of the file
• Pointers to the blocks that store the file’s data
• Optionally:
  o Single indirect block: pointer to a disk block that contains an index of pointers to data blocks
  o Double indirect block: pointer to more single indirect blocks
  o Triple indirect block: pointer to more double indirect blocks
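
To see why the indirect blocks matter, the sketch below counts how many data blocks one inode can address. The figures are assumptions chosen for illustration (12 direct pointers, 4096-byte blocks, 4-byte block pointers); the notes do not give specific values, and real file systems differ.

```python
def max_file_blocks(direct_pointers: int, pointers_per_block: int) -> int:
    """Data blocks reachable through direct, single, double and triple indirect pointers."""
    single = pointers_per_block        # one index block of pointers to data blocks
    double = pointers_per_block ** 2   # index block -> index blocks -> data blocks
    triple = pointers_per_block ** 3   # one more level of index blocks
    return direct_pointers + single + double + triple


# Assumed values for illustration only
BLOCK_SIZE = 4096        # bytes per block
POINTER_SIZE = 4         # bytes per block number
DIRECT_POINTERS = 12     # direct pointers held in the inode itself

pointers_per_block = BLOCK_SIZE // POINTER_SIZE
blocks = max_file_blocks(DIRECT_POINTERS, pointers_per_block)
print(f"{pointers_per_block} pointers fit in one index block")
print(f"{blocks} data blocks, about {blocks * BLOCK_SIZE / 2**40:.2f} TiB")
```

Under these assumptions almost all of the capacity comes from the triple indirect block, while small files need only the direct pointers.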

Efficiency: Complexity Theory
Example: n = number of blocks in a file. To find a certain block, how many disk blocks are accessed?
• Contiguous: O(1), “about 1”
• Chained/linked: O(n), “roughly n blocks”
• Indexed/inode: O(log_k(n)) blocks, where k is the number of pointers per index block
• See: https://www.bigocheatsheet.com/
Left: the lower the better, O(1) is the x-axis
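
The three growth rates can be compared with a small counting sketch. The function names and the fan-out k = 1024 are assumptions for illustration; the counts are worst-case block reads for one target block, not measurements.

```python
def contiguous_accesses(n_blocks: int) -> int:
    """Start address + offset gives the block directly: one read."""
    return 1


def chained_accesses(n_blocks: int) -> int:
    """Worst case: follow the chain from the head to the last block."""
    return n_blocks


def indexed_accesses(n_blocks: int, fan_out: int) -> int:
    """Index levels needed to cover n_blocks, plus the data block itself."""
    levels, capacity = 1, fan_out
    while capacity < n_blocks:
        capacity *= fan_out
        levels += 1
    return levels + 1


for n in (10, 1_000, 1_000_000):
    print(n, contiguous_accesses(n), chained_accesses(n), indexed_accesses(n, 1024))
```

For a file of a million blocks the chained method may read a million blocks, while the indexed method reads two index blocks and one data block under these assumptions.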

The structure of iNodes allows for particular behaviours when moving and deleting files, because a filename only points to an iNode:
Filename → iNode → Contents
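
The Filename → iNode → Contents chain can be observed from Python on a Unix-like system. This is a sketch under assumptions: the file names are made up, the function name `inode_demo` is hypothetical, and it relies on the file system supporting hard links (`os.link`).

```python
import os
import tempfile


def inode_demo():
    """Show that renaming keeps the inode and that data survives until the last name is removed."""
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "notes.txt")
        with open(original, "w") as f:
            f.write("hello")
        inode_before = os.stat(original).st_ino

        # Moving within one file system only rewrites the directory entry
        renamed = os.path.join(folder, "renamed.txt")
        os.rename(original, renamed)
        same_inode = os.stat(renamed).st_ino == inode_before

        # A hard link is a second filename pointing at the same inode
        alias = os.path.join(folder, "alias.txt")
        os.link(renamed, alias)
        links = os.stat(renamed).st_nlink

        # Deleting one name leaves the contents reachable through the other
        os.remove(renamed)
        with open(alias) as f:
            content = f.read()

    return {"same_inode_after_move": same_inode, "links": links, "content_after_delete": content}


print(inode_demo())
```

The contents are only released once no filename refers to the inode any more.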

Week 3: The web and security

Security and Encryption: Principles
• Confidentiality
• Integrity
• Availability

Confidentiality
Information accessible only to authorised users
• i.e. can’t be seen (encryption): by whom? (authentication) When? Where? (access controls) How? (location, transmission path, protocols)

Integrity
Safeguarding accuracy/completeness of:
• information
• processing methods
where information:
1. is only entered/altered by authorised users
2. cannot be altered without detection (in storage/transit)
Detection:
• Audit trails
• Mathematical means: hashes, checksums, message digests

Availability
Ensures authorised users have access to information/processing when required, from authorised locations, i.e.
• Systems survive failures
  o Hot/cold standby mechanisms
• Systems resist attacks
  o Resistant to Denial of Service (DoS) attacks

Good vs Evil
• Security service: uses one or more security mechanisms
• Security attack: any action that compromises information security
• Security mechanism: mechanism designed to detect, prevent or recover from a security attack
Left: types of security attacks and their preventatives
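
The “mathematical means” of detecting alteration listed under Integrity can be sketched with Python’s standard `hashlib` module. SHA-256 is chosen here as an assumption (the notes do not name an algorithm), and the message text and function names are invented for illustration.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return a message digest (SHA-256, chosen for this sketch) as hex text."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, recorded_digest: str) -> bool:
    """Recompute the digest and compare it with the one recorded earlier."""
    return digest(data) == recorded_digest


original = b"Transfer $100 to Alice"
recorded = digest(original)           # stored or sent alongside the message

tampered = b"Transfer $900 to Alice"  # one character changed in transit

print(is_unaltered(original, recorded))
print(is_unaltered(tampered, recorded))
```

A digest on its own only detects change; it does not show who made the message, which is why the notes pair hashing with keys later on.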

Typical security services
• Confidentiality: privacy, encryption
• Authentication: who created/sent it
• Integrity: hasn’t been altered
• Non-repudiation: order is final
• Access control: prevent misuse of resources
• Availability: permanence, non-erasure

Security service: Encryption
Converting plaintext into ciphertext to prevent unintended recipients from reading it, e.g. rot13

Secret key cryptography
• The simplest cryptography uses symmetric key encryption, e.g. a password-protected file
• However, keys need to be kept secret + exchanged between the involved parties

Public key cryptography
• Each party has 2 keys (private + public, e.g. RSA)
• One encrypts and the other decrypts
• Can be used for the 4 previous security capabilities
  o Authentication: sender encrypts with own private key + receiver decrypts with sender’s public key
  o Privacy: sender encrypts with receiver’s public key
  o Data integrity: if anything is changed along the way, it can’t be properly decrypted
  o Non-repudiation: same as authentication

Web Security: Encryption

Most commonly used for SSL (Secure Sockets Layer, seen in the browser as https://)
Provides:
• Confidentiality: stops interception
• Integrity: stops modification
• Authentication: verifies owner of website, (optional) certificate-based security
Uses:
• Public key cryptography
• Symmetric (shared secret) crypto

Hashing
Putting a code on data (e.g. email) to ensure it hasn’t been modified along the way or sent by someone else entirely.
1. Compute checksum (modular sum of the characters in the file): cksum filename.txt
2. Encrypt checksum + sender’s name with sender’s private key
3. Receiver uses sender’s public key to decrypt checksum
4. If the checksum doesn’t match = modified message
Typically used in:
• Secure email: some clients can verify untampered email → needs personal digital certificate
• Electronic documents: Adobe PDF allows you to digitally sign documents
• Validate software: when installing software on Windows, the installer must be digitally signed

Security Service: Authentication
Who are you? Verifies identity
• Used for non-repudiation + access control
• Need to authenticate people, organisations, applications
• How: something you
  o Know: password, PIN
  o Have: key, token, certificate
  o Are: fingerprint, retinal scan, facial recognition, voice recognition

Where to authenticate
• Web authentication: usually by “identity” verification
• Most sites use a userid and password popup
  o “basic” authentication, where the password is sent encoded but not encrypted (the popup shows *** instead of the actual words/letters)

UNIX security: Authentication
Typically by userid + password
• Trivial setups: saved in password file
• Larger scale: stored in central directory service
  o E.g. Active Directory (Microsoft), LDAP (everyone else)

Security service: Access controls
• Physical: tight control of physical access, token based? Not tied to user?
• Logical: enforced by operating systems, applications + security devices (e.g. firewall)
  o Needs configuration = management cost
Each file and directory has 3 sets of permissions: read, write + execute (traverse if directory)
Usually has 3 levels of security:
• User: owner of file
• Group: other users in owner’s group
• Others: public

Security Mechanism: Audit log
Needed to: measure effectiveness, do forensics, create alerts

Security and risk assessment
Security should match risk assessment level
• Internet or external threat?
• Vandalism?
• Theft?

Week 4: The Web and HCI

Web servers
A program that accepts requests from a browser over a network connection.
• Returns content (e.g. page of HTML)
• HYPERTEXT = links to other pages within a page

Web content management
Web servers usually just serve stored web pages, but pages can also be dynamically generated.
• Programs generate HTML or pictures etc. (e.g. PHP, ASP.NET)
• Content Management Systems: used by some large web sites to automatically manage content
  o E.g. WordPress, Joomla
Web browsers present text aligned with the structure.
XHTML: a strict version of HTML, prevents misinterpretation of text by the browser

Web Pages
Plain text files. Contain:
• Text +
• Markup (“tags”)
  o HTML: hyper-text markup language
  o CSS: cascading style sheets (presentation format)
  o HTML structures text into headings, lists, paragraphs, tables
Browsers use “tags” to control how web pages are viewed.
• Web pages are HYPERTEXT (“link”) to other pages via various mechanisms
• HTML has several “dialects” (most popular: HTML 4.0 Transitional)

World Wide Web
An HTML viewer is needed to see HTML documents. Every web browser has an HTML viewer called an “engine”.
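
The web server description above (accept a request, return generated HTML containing a link) can be sketched with Python’s standard `http.server` module. This is a toy for illustration, not how production servers such as those running PHP or ASP.NET work; the handler name, page text and `/about` link are assumptions.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PageHandler(BaseHTTPRequestHandler):
    """Generates a small HTML page for every GET request."""

    def do_GET(self):
        # The page is built per request, i.e. dynamically generated
        body = (
            "<html><body>"
            f"<h1>You asked for {self.path}</h1>"
            '<a href="/about">A hypertext link to another page</a>'
            "</body></html>"
        ).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once(path="/"):
    """Start the server on a free local port, request one page, then stop it."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", path)
        response = connection.getresponse()
        status, page = response.status, response.read().decode("utf-8")
        connection.close()
        return status, page
    finally:
        server.shutdown()
        server.server_close()


status, page = fetch_once("/index.html")
print(status)
print(page)
```

Here `http.client` plays the role of the browser; a real browser would then pass the returned HTML to its engine for display.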

HTML
http://ryanstutorials.net/
www.w3schools.com/howto

Week 5: The web and HCI

Problem with HTML
• Structure: defines components + areas on the page (top, middle, footer)
• Presentation: defines how information is presented (i.e. style: colour, size, font)

Style sheet and HTML presentation
Style sheets move presentation aspects (colour, font, etc.) out of HTML + into separate style sheets.
• Many presentation styles are deprecated (eventually dropped from the HTML standard)
• CSS (Cascading Style Sheets): a mechanism for adding style to web documents. Allows us to define the behaviour of each tag or container for our content

4 selectors available:

Tag: for any HTML tag
  Stylesheet: p { color: yellow; }
  HTML: <p>this paragraph is yellow</p>

#id: for a certain element, starts with #
  Stylesheet: #myYellow { color: yellow; }
  HTML: <p id="myYellow">this para is yellow</p>

.class: apply a style to a group of elements
  Stylesheet: .yellowPara { color: yellow; }
  HTML: <p class="yellowPara">this para is yellow</p>

Inline: hardcoding a style in an element
  HTML: <p style="color: yellow;">this para is yellow without a stylesheet</p>

Dividing pages into sections:
• <div>: apply style to elements that need to be separated, places a line break
• <span>: apply a different style to part of your content inline, without line breaks

How to implement CSS

Imported: adding the contents of an external CSS file to another set of CSS rules
  Stylesheet: @import url(/css/mystyle.css)

Linked: an external CSS file included in the head of the page (BEST WAY)
  HTML: a <link> element, only within <head>

Embedded: including CSS as a section in the head of the page
  HTML: a <style> element

Inline: style included as an attribute of a tag

CSS Positioning
Allows us to position HTML anywhere on a screen (great for navigation bars w/o JavaScript)
• Can be relative to current position
• Can be absolutely placed, by pixel or percentage
• Can specify margins, borders, spacing, padding
• Older layout-only markup is no longer needed

Cascading (C in CSS)
Styles cascade in order of precedence (lowest, 1, to highest, 7):
1. Browser default settings
2. User settings in browser
3. Linked external CSS
4. Imported CSS
5. Embedded CSS
6. Inline CSS
7. HTML tag attributes

Nesting of tags
When tags are nested, the styles for several tags may apply to a piece of content.
• Rule: inline → id → class + applied to innermost tag
• E.g. Hello, World

• Internet Explorer treats precedence differently
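
The precedence ordering in the Cascading section can be expressed as a small lookup. This sketch follows the seven levels exactly as the notes list them (browser defaults lowest, HTML tag attributes highest); the source labels and the function name are assumptions, and real browsers also weigh selector specificity and other rules not modelled here.

```python
# Precedence as listed in the notes, lowest first
PRECEDENCE = [
    "browser default",
    "user settings",
    "linked",
    "imported",
    "embedded",
    "inline",
    "html attribute",
]


def winning_value(declarations: dict) -> str:
    """Pick the value whose source has the highest precedence.

    `declarations` maps a source label from PRECEDENCE to the value
    that source sets for a single property (e.g. color).
    """
    best_source = max(declarations, key=PRECEDENCE.index)
    return declarations[best_source]


color_rules = {
    "browser default": "black",
    "linked": "blue",
    "inline": "yellow",
}
print(winning_value(color_rules))
```

With these three sources the inline style wins because it sits highest in the list among those present.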

Benefits of CSS
• Definite separation of style + structure
• Simpler, cleaner code
• Ability to define the look of several pages from one location
• Better code reuse

Disadvantages of CSS
• Not all browsers follow standards properly, e.g. Internet Explorer 6
• “Quirks mode”: standards-based browsers try to emulate broken Internet Explorer

Week 6: Operating Systems 3

Programs vs Processes
Processes are programs in execution (“tasks”)
• There are “system processes”, “service processes” or “background tasks”

Process management
The computer runs programs; process management is how the OS handles this:
• Starting processes
• Performing inter-process communication
• Managing processes
• Terminating processes

Managing processes
Modern operating systems can appear to run multiple programs concurrently.
One program at a time:
• Single tasking
• Batch processing
Concurrent processing:
• Multiprogramming
• Multitasking: all modern operating systems
• Multithreading: most modern OS, Pentium
• Multiprocessing

Process States
Processes can be in various “states”
1. Created: i.e. loaded into memory
2. Waiting to be run by the CPU
3. OS runs process
4. If the process needs a resource → “blocked” (waiting) until resource received
   a. If not, then steps 1-3 keep looping until finished (stopped, step 6)
5. Waiting again until OS restarts process
6. When finished, OS stops process

Interrupts
The OS runs processes continuously until interrupted. Interrupts come from:
• Hardware: e.g. ctrl-alt-del keystroke, mouse movement, network
• Programs: e.g. runtime error, pause for input, wait for a block of data to load from disk
• OS forces the currently running process into the “blocked” state + runs a special “interrupt handler”
  o Often passes control to a different process while waiting
  o Interruption occurs at step 3 (Running)

Process scheduling
How the OS decides which process to run on which CPU
• Different algorithm types decide the order of processes, e.g. first-in first-out, pre-emptive, round-robin

Concurrent programming
Different operating systems use different techniques to run processes.

Multiprogramming
When a process waits for I/O, run the next process: “cooperative”
• Can get “hangs” if a process does no I/O or doesn’t relinquish the CPU

Multitasking
Sets a timer on all processes for “fair share” CPU time. Switch when the timer expires.
• Can prioritise (low, normal, high, real-time): “pre-emptive”
• Choice of queuing algorithm decides which process is run next

Multithreading
Multiple instances of the same code, e.g. same process in memory, but ≥2 execution paths
• Allows OS to run a process on >1 CPU
• Very efficient use of resources (e.g. memory)

Multiprocessing
Multitasking spread amongst ≥2 CPUs

Inter Process Communication (IPC)
Operating systems allow communication between processes via:

One-way IPC: Pipes
One-way communication.
• Process A sends data to process B
• Process B accepts data
e.g. Bash “pipe” operator → output of one command (process) becomes the input of another command (process)
  o ls | more

Two-way IPC
Processes can communicate in both directions
• Shared memory (bit of memory acts as file)
• Named pipe (2 way pipe, acts as file)
• Socket (network or internal interface)
• Message queue/message passing (special programming interface that passes data like internal messages/SMS)
• Semaphore (special flag/file controlling access to resources)

Resource management
Resources are things that processes might need to run (e.g. files, network, human interface devices).

• Kernel manages all other system resources (e.g. interrupts, I/O, system devices)
• Many resources require mutually exclusive access (i.e. processes wait for each resource)
  o Otherwise results in “resource contention” (conflict over access to a shared resource)
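
Mutually exclusive access can be illustrated with a lock around a shared counter. As a simplifying assumption this sketch uses threads inside one Python process to stand in for competing processes; the function name and the numbers are invented for illustration.

```python
import threading


def run_counter(n_workers: int = 4, increments: int = 10_000) -> int:
    """Several workers update one shared counter, one at a time."""
    lock = threading.Lock()
    counter = 0

    def worker():
        nonlocal counter
        for _ in range(increments):
            with lock:        # wait here until the resource is free
                counter += 1  # only one worker is ever inside this section

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter


print(run_counter())
```

Each worker waits its turn for the lock, so no update is lost; without that waiting, simultaneous updates are the kind of conflict the notes call resource contention.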

Two-way IPC + Resource contention
Resource contention: 2 processes wanting to alter the same resource at the same time
• Resource types: memory, files, hardware
• Solutions to “deadlock”:
  o Semaphores: flag held by the process changing memory
  o Lock files: file not readable/writable while data is being written in

Deadlock
Processes compete for a limited number of resources; deadlock is when the resource we’re waiting for never gets released.
Can occur if all these conditions hold simultaneously:
• Mutual exclusion: a resource can only be accessed by one process at a time
• Hold and wait: there’s a process that holds at least 1 resource + is waiting for another reso...

