CMPSC 311 Midterm 1 Cheatsheet PDF

Title CMPSC 311 Midterm 1 Cheatsheet
Author Shing Lin
Course Introduction To Systems Programming
Institution The Pennsylvania State University
Pages 32

Summary

abutalib aghayev...


Description

Defining properties of systems languages:
• Expose details of underlying hardware
• Give fine-grained control

Layered view
• provides service to layers above
• understands and relies on layers below
• more useful, portable, reliable abstractions
• constrained by performance, footprint, and behavior of layers below

Operating systems
• software layer that abstracts away messy details of hardware into useful, portable, powerful interface
• modules: file system, virtual memory system, network stack, protection system, scheduling subsystem, … (each of these is a major system of its own)
• design and implementation have many engineering tradeoffs, e.g., speed vs. (portability, maintainability, simplicity)

Systems and Layers
• layers are collections of system functions that support some abstraction to service/app above
  o hides specifics of implementation of layer
  o hides specifics of layers below
  o abstraction may be provided by software or hardware
  o examples from OS layer: processes, files, and virtual memory
• abstraction should match “cognitive model” of users of system, interface, or resources

How humans think
• our brains receive sensor data to perceive and categorize environment (pattern matching and classification)
  o things that are easy to assimilate (learn) are close to things we already know
  o the simpler and more generic an object is, the easier (most of the time) it is to classify

Processes
• hardware-supported structures that form independent programs running concurrently within operating system
  o abstraction provided: each process appears to have sole control of entire computer (single stack and execution context)
  o to see what processes are running on your UNIX system, use “ps” command

Files
• abstraction of read-only, write-only, or read/write data object
  o data file – collection of data on some media, often on secondary storage (hard disk)
• in UNIX nearly everything is a file
  o devices like printers, USB buses, disks, etc.
  o system services like sources of randomness (RNG)
  o terminal (user input / output devices)
• /dev directory of UNIX contains real and virtual devices

Virtual Memory
• Abstraction provides control over imaginary address space
  o Has virtual address space which is unique to process
  o OS/hardware work together to map addresses onto:
    ▪ Physical memory addresses
    ▪ Addresses on disk (swap space)
  o Advantages:
    ▪ Avoids interference from other processes
    ▪ “swap” allows more memory use than physically available

Byte-Oriented Memory Organization
• programs refer to virtual addresses
  o conceptually very large array of bytes
  o implemented with hierarchy of different memory types
  o system provides address space private to particular “process”
    ▪ a program can clobber its own data, but not that of others
• compiler + run-time system control allocation
  o where different program objects should be stored
  o all allocation within single virtual address space

Machine Words
• machine has “word size”
  o nominal size of integer-valued data, including addresses
  o many old machines use 32-bit (4-byte) words
    ▪ limits addresses to 4 GB
    ▪ too small for memory-intensive applications
  o current systems use 64-bit (8-byte) words
  o potential address space approx. 1.8 × 10^19 bytes
  o x86-64 machines support 48-bit addresses: 256 terabytes
• machines support multiple data formats
  o fractions or multiples of word size
  o always integral number of bytes

Word-Oriented Memory Organization
• addresses specify byte locations
  o address of first byte in word
  o addresses of successive words differ by 4 (32-bit) or 8 (64-bit)

API
• Application Programming Interface – set of methods (functions) used to manipulate an abstraction (e.g., printf, network sockets)
  o “library” of calls to use abstraction

Programming Languages
• Low-level languages (C, C++)
  o Hide some architectural details, are kind of portable, have a few useful abstractions, like types, arrays, procedures, objects
  o Permit (force?) programmer to handle low-level details like memory management, locks, threads
  o Low-level enough to be fast and to give programmer control over resources
  o Double-edged sword: low-level enough to be complex, error-prone
  o Shield: engineering discipline
• High-level languages (Python, Ruby, JavaScript, …)
  o Focus on productivity and usability over performance
  o Powerful abstractions shield you from low-level gritty details (bounded arrays, garbage collection, rich libraries, …)
  o Usually interpreted, translated, or compiled via intermediate representation
  o Slower (by 1.2x to 100x or more)
  o Less control

Main attributes of UNIX
• Multiuser – supports multiple users on system at same time, each working with their own terminal
• Multitasking – supports multiple programs at a time
• Portability – when moving from hardware to hardware, only lowest layers of software need to be reimplemented

Linux
• Can be viewed as software layers
  o OS kernel – direct interaction with hardware/firmware
  o System calls – interface to kernel
  o System libraries – wrappers around system calls
  o Programming language libraries – extend system libraries
  o System utilities – application-independent tools (e.g., fsck, fdisk, ifconfig, mknod, mount, nfsd)
  o Command interpreter, command shell – user interface (in terminal program)
  o Application libraries – application-specific tools
  o Applications – complete programs for ordinary users
    ▪ Some applications have their own command shells and programming language facilities (e.g., Perl, Python, …)

Open source
• How many software systems in use today are distributed
  o Distributed with license where copyright allows user of source to review, modify, and distribute with no cost to anyone
  o Variants of this arrangement allow a person to (a) derive software from distribution and charge for it, or (b) never charge anyone for derivative works

Operating Systems
• Software that:
  o Directly interacts with hardware
    ▪ OS is trusted to do so; user-level programs are not
    ▪ OS must be ported to new HW; user-level programs are portable
  o Manages (allocates, schedules, protects) hardware resources
    ▪ decides which programs can access which files, memory locations, pixels on screen, etc., and when
  o Abstracts away messy hardware devices
    ▪ Provides high-level, convenient, portable abstractions
    ▪ E.g., files vs. disk blocks

OS is “layer below”
• Module that your program can call with system calls
• Provides powerful API (POSIX API)

Protection system
• OS isolates processes from each other but permits controlled sharing between them through shared name spaces (e.g., FS names)
• OS isolates itself from processes; therefore, it must prevent processes from accessing hardware directly
• OS is allowed to access hardware
  o when user processes run, CPU is in unprivileged (user) mode
  o when OS is running, CPU is in privileged (kernel) mode
  o user-level processes invoke a system call to safely enter OS
• Example
  o CPU (thread of execution) is running user-level code in process A; that CPU is set to unprivileged mode
  o Code in process A invokes a system call; hardware then sets CPU to privileged mode and traps into OS, which invokes the appropriate system call handler
  o Because CPU executing thread that’s in OS is in privileged mode, it is able to use privileged instructions that interact directly with hardware devices like disks
  o Once OS has finished servicing system call (which might involve long waits as it interacts with HW), it sets CPU back to unprivileged mode and returns out of system call back to user-level code in process A
  o Process continues executing whatever code is next after system call invocation

Hardware Privilege Modes
• Hardware state that restricts operations that code may perform
  o E.g., prevents direct access to hardware, process controls, and key instructions
• User mode – for normal programs running with low privilege (also system services that run in “user space”)
• Kernel mode – operating system running
• Unrelated to superuser (root, administrator) privileges

Device Drivers
• Software module (program) that implements interface to piece of real or virtual hardware (often needs kernel mode privilege)
  o E.g., printers, monitors, graphics cards, USB devices, etc.
  o Often provided by manufacturer of device
  o For performance reasons, driver is commonly run within operating system as part of kernel (in kernel space)
  o In past, device drivers were often directly compiled into kernel (they were extensions to operating system)
    ▪ Required administrator to recompile operating system when new device type was introduced
    ▪ Each system had different kernel

Recompiling Kernels
• Recompilation of kernel is problematic
  o Takes long time
  o Requires sophistication
  o Versioning problems
• Solution 1
  o User-space modules – creating user-space programs that support operating system
  o Leverages protection against buggy code
  o Allows independent patching and upgrading
  o Removes dependency on kernel version (mostly)
  o Problem: performance (interacting with user space is often much slower than in-kernel operations)
• Solution 2
  o Kernel modules (loadable kernel modules) – software modules that run in kernel space and can be loaded and unloaded on running system
    ▪ Can extend kernel functionality without recompilation
    ▪ Trick is that kernel provides generic interfaces (APIs) that module uses to communicate with kernel
    ▪ Used by almost every modern OS (OSX, Windows, etc.)
• To see what modules are loaded on your UNIX system, use “lsmod” command

Command line
• Shell program (“bash” on Linux)
• Interprets built-in commands
• Runs other programs
• Runs shell scripts

Hierarchical File System – file system that organizes data and program files in top-to-bottom structure
• Directory tree – file system of computer; files are grouped into directories and directories are organized in hierarchy
• Root directory – top of hierarchy, “/”; contains everything
• Home directory – within root directory; initial location when you first open terminal or window; contains user files
• Standard directory contents
  o Binary directories – hold files that contain compiled source code (machine code); can be executed on computer; sometimes called executables
    ▪ /sbin – system binaries (e.g.: fsck, init, route) for vital system tasks to configure operating system
    ▪ /lib – shared libraries and kernel modules; shared library files used by core system programs; libraries essential for binaries in /sbin
  o /etc – system configuration files; contains all machine-specific configuration files
  o /dev – device files; populated with files as kernel recognizes hardware
  o /home – user home directories containing saved files, personal settings, etc.
  o /usr – programs and support files for users; secondary hierarchy for sharable read-only user data; contains majority of multiuser utilities and applications
  o /var – variable files: files whose content is expected to continually change during normal operation of system (system log files, spool files, temporary email files); files of unpredictable size
• Absolute path – starts with root directory “/”; specifies complete path for file or directory (e.g. /usr/bin/firefox)
• Relative path – specifies path for file relative to current directory (e.g. if current directory is /usr, then ./bin/firefox is relative path for firefox binary)

Types of Shell Commands
• Binary files – separate programs executed by shell
• Shell built-ins – commands interpreted by shell; functionality of these commands is implemented in shell program itself
• Alias – shortcuts defined by users to avoid typing long commands or command sequences

Shell variables – variables that user can set to control shell’s behavior; names are conventionally strings of uppercase letters
• PATH – environment variable holding colon-separated list of directories; when you execute a command, shell searches through each of these directories, one by one, until it finds directory where executable exists
  o Environment variable – variable that persists for life of terminal session

Input / Output Redirection
• Every program executed in shell has three streams associated with it; when command begins running, it usually expects that three files are already open
  o Standard input – where program reads input from; attached to keyboard by default
  o Standard output – where program writes its output; attached to screen by default
  o Standard error – where program writes its errors; attached to screen by default
• Shell allows us to redirect these streams from their defaults
  o > – redirects standard output to a file; file is emptied before writing; if file exists it will be replaced
    ▪ ls -l /usr/bin > ls-output.txt
  o >> – redirects standard output and appends it to a file; if file does not exist it will be created, if it exists output is added to end of file; after these two commands, file will hold contents of /usr/bin directory twice:
    ▪ ls -l /usr/bin > ls-output.txt
    ▪ ls -l /usr/bin >> ls-output.txt
  o 2> – redirects standard error to file
    ▪ ls -l /nonexistentfile 2> ls-error.txt
  o < – read input from file instead of keyboard
    ▪ cat = && || !

• scope (local scope is within set of {} braces)
• comments: /* comment */ or // comment *to EOL*
• variables
  o must declare at start of function or block (not required since C99)
  o need not be initialized before use (gcc -Wall will warn); always initialize your vars
• const
  o qualifier that indicates variable’s value cannot change
  o compiler will issue error if you try to violate this
• 0 means false, everything else true

Primitive types in C
• integer types (char, int)
• floating point (float, double)
• modifiers
  o short [int]
  o long [int, double]
  o signed [char, int]
  o unsigned [char, int]

Pointers
• taking address of variable: &
• dereferencing pointer: *
• aliasing: *ip is an alias for i
• C always passes arguments by value
  o Value is “copied” into function
  o Local modifications are not reflected in original value passed
• Pointers let you pass by reference
  o Pass “memory location” of variable

Pass by value
• C passes arguments by value
  o Callee receives copy of argument
  o If callee (function that is called) modifies argument, caller’s copy isn’t modified

Pass-by-reference
• Can use pointers to pass by reference
  o Callee still receives copy of argument, but argument is a pointer
  o Pointer’s value points to variable in scope of caller
  o Gives callee a way to modify variable that’s in scope of caller

Arrays
• Bare, contiguous block of memory of correct size
• Array of 6 integers requires 6 x 4 bytes = 24 bytes of memory (with 4-byte int)
• Have no methods, do not know their own length (no bounds checking)
  o C doesn’t stop you from overstepping end of array
  o Many security bugs come from this (buffer overflow)

Strings
• array of char
• terminated by null character '\0'
• are not objects, have no methods; string.h has helpful utilities

Errors and exceptions
• C has no exceptions (no try / catch)
• Errors are returned as integer error codes from function
• Sometimes makes error handling ugly and inelegant

Crashes – If you do something bad, you’ll end up spraying bytes around memory, hopefully causing “segmentation fault” and crash

Objects – there aren’t any; struct is closest feature (set of fields)

Memory management
• No garbage collector
• Anything you allocate you have to free (otherwise memory leaks)
• Local variables are allocated off of stack
  o Freed when you return from function
• Global and static variables are allocated in data segment
  o Freed when your program exits
• You can allocate memory in heap segment using malloc()
• Failing to free is a leak, double-freeing is error (hopefully crash)

Console I/O – C library (libc) has portable routines for reading / writing, e.g., scanf(), printf()

File I/O
• C library has portable routines for reading / writing
  o fopen(), fread(), fwrite(), fclose(), etc.
  o does buffering by default, is blocking by default
• OS provides system calls
  o Low-level binary reads and writes, e.g., read(), write(), open(), close()
• Network I/O
  o C standard library has no notion of network I/O
  o OS provides (somewhat portable) routines
  o Lots of complexity lies here
  o Errors: network can fail
  o Performance: network can be slow
  o Concurrency: servers speak to thousands of clients simultaneously

Libraries
• C has very few compared to most other languages
• No built-in trees, hash tables, linked lists, sort, etc.
• You have to write many things on your own
• Particularly data structures
• Error prone, tedious, hard to build efficiently and portably
• Less productive language than Java, C++, Python, or others

Function prototype – bodiless function declaration

UNIX Std*
• Three predefined streams provided to all UNIX programs
  o Standard input (stdin)
  o Standard output (stdout)
  o Standard error (stderr)
• printf("this is printed to standard output\n");
• fprintf(stdout, "this is printed to standard output as well\n");
• fprintf(stderr, "this is printed to standard error\n");

Pointers
• type *name; // declare pointer
• type *name = address; // declare + initialize pointer
• pointer is variable that contains memory address

Dereferencing pointers
• dereference: access memory referred to by pointer
  o *pointer // dereference pointer
  o *pointer is alias for variable pointer points to
  o *pointer = value; // dereference / assign
  o pointer_p = pointer_q; // pointer assignment

Pointers as function arguments
• pointers allow C to emulate pass by reference
  o enables modifying out parameters, efficient passing of in parameters
• okay to return passed-in pointer (or dynamically allocated memory)
• not okay to return address of local variable

Variable Storage Classes
• auto – automatically allocated and deallocated variables (local function variables declared on stack)
• global – globally defined variables that can be accessed anywhere within program
  o keyword extern is used in .c/.h files to indicate variable defined elsewhere
  o declared outside of function (without any keyword)
    ▪ accessible from anywhere within same file
    ▪ accessible in other files via extern keyword
• static – variable that is global to local file only
  o keyword static is used to identify variables as local only
  o declared outside of function with static keyword
  o can also appear within function:
    ▪ limits scope to function
    ▪ unlike automatic variables, preserves changes across invocations
• in general, static or global variables are given default value (often zero) and auto storage class variables are indeterminate, meaning compiler can do anything it wants, which for most compilers means taking whatever value is in memory; you cannot depend on indeterminate values
• global and static variables:
  o initialized to supplied or default values before program execution begins
  o preserve changes until end of program execution
• also apply to functions
  o by default, functions are global
  o functions in C cannot be nested; hence no static function within function

Arrays
• type name[size];
• initially each array element contains garbage data
• array does not know its own size
• sizeof(scores) is not reliable; only works in some situations
• C99 standard a...


Similar Free PDFs