Advanced topics - Final exam notes PDF

Title	Advanced topics - Final exam notes
Course	Advanced Topics in Computer Science
Institution	Brunel University London
Pages	11
File Size	368.8 KB
File Type	PDF


What is a cryptocurrency?
A cryptocurrency is a digital currency in which cryptographic techniques are used to regulate the generation of units of currency and to verify the transfer of funds, operating independently of a central bank.

What is Bitcoin?
* A decentralized cryptocurrency whose distribution and exchange are independent of any controlling or regulating body, such as a central bank or a state.
* Bitcoins must be mined by powerful computers that solve mathematical problems. There is a maximum of 21 million bitcoins, of which 12 million had already been mined at the time of writing.
* Bitcoins can be transferred from one user to another without credit card companies or other intermediaries. All transactions are public and are stored pseudonymously: they are tied to addresses rather than to real names.

What is a distributed ledger?
A distributed ledger is a database that is consensually shared and synchronized across a network spread over multiple sites. It allows transactions to have public witnesses, thereby making a cyberattack more difficult.

Distributed ledgers (definition)
The network behind a distributed ledger may span multiple sites, institutions or geographies. The participant at each node of the network can access the records shared across that network and can hold an identical copy of them. Any changes or additions made to the ledger are reflected and copied to all participants within seconds or minutes. The blockchain, the technology that underlies Bitcoin, is the best-known distributed ledger technology.

Distributed ledger technology (DLT) is a digital system for recording transactions of assets in which the transactions and their details are recorded in multiple places at the same time.

What is a blockchain?
* A blockchain is a tamper-resistant digital ledger of economic transactions that can be programmed to record not just financial transactions but virtually anything of value.
* A blockchain, as the name implies, is a chain of digital "blocks" that contain records of transactions. Each block stores the hash of the block before it, so every block is indirectly connected to all the blocks before and after it.

* Traditional currency transactions go through a central payment processor such as a credit card company, whereas all Bitcoin transactions are processed by a large distributed network of computers running special software.
* Whenever a transaction occurs, the network records the sender's and receiver's bitcoin addresses and the amount transferred, and appends this information to the end of a ledger known as the blockchain.

What is within a block?
Data: The data stored within a block depends on the type of blockchain. Example: the Bitcoin blockchain stores the details of each transaction, such as the sender, the receiver and the amount of coins.
Hash:
* It identifies a particular block and all of its contents, and it is effectively unique to that block.
* Once a block is created, its hash is calculated. Changing anything inside the block causes the hash to change, which makes the hash useful for detecting changes.
* The first block (block 0, the genesis block) is the ancestor of every other block: each block points back to the one before it, so every Bitcoin block can trace its lineage back to block 0.
Signatures: Similar to signing a check, cryptographic signatures determine which transactions are valid. A signature is generated from a hash of the data to be signed and a private key.

How does a blockchain work?
* The ledger of all transactions is held by everyone. The Bitcoin blockchain gains a new block roughly every ten minutes (over 100 times per day), and each update is sent to every computer that processes Bitcoin transactions.
* Each transaction is signed using public-key cryptography and verified at multiple points in the network, which ensures that every computer processing Bitcoin transactions uses an identical, correct copy of the blockchain. This makes the ledger virtually impossible to counterfeit.

Blockchain vs Bitcoin
Bitcoin is a cryptocurrency based on a blockchain. Blockchain is the underlying mechanism for Bitcoin. Although initially created for Bitcoin, blockchain technology offers a more secure and transparent way of processing many kinds of data, so it has a wide range of potential applications.

Blockchain security (built-in trust)

* Hashing: Changing a single block makes all the following blocks invalid, because each block is linked by hash to the blocks before and after it. This makes it difficult to tamper with a single record: an attacker would need to change the block containing that record as well as every block linked after it to avoid detection.
* Proof-of-work mechanism: Proof of work (PoW) is the original consensus algorithm in a blockchain network. It is used to confirm transactions and add new blocks to the chain. With PoW, miners compete against each other to confirm transactions on the network and are rewarded for doing so.
* P2P network: In the case of Bitcoin, instead of a bank validating financial transactions (such as sending money from A to B) by checking a digital ledger of who owns what stored on its own server, a P2P network of computers running the Bitcoin protocol validates transactions by majority consensus. The consensus rules of the Bitcoin network govern how the participants in the network interact with each other.
* Digital signatures: Network participants have their own private keys, which they use to sign the transactions they make; the result acts as a personal digital signature. If a record is altered, the signature becomes invalid and the peer network knows right away that something has happened.
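The hash linking described in the first bullet can be illustrated with a minimal sketch. It assumes a toy block layout (index, data, previous hash) and SHA-256 over a JSON encoding; the function names and the layout are made up for illustration and are not Bitcoin's real block format.

```python
import hashlib
import json


def block_hash(index, data, prev_hash):
    """SHA-256 over the block's contents, including the previous block's hash."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Build a list of blocks in which each block stores the hash of its predecessor."""
    chain = []
    prev_hash = "0" * 64  # placeholder predecessor for block 0 (assumption)
    for index, data in enumerate(records):
        h = block_hash(index, data, prev_hash)
        chain.append({"index": index, "data": data, "prev": prev_hash, "hash": h})
        prev_hash = h
    return chain


def first_invalid_block(chain):
    """Return the index of the first block that fails verification, or None."""
    prev_hash = "0" * 64
    for block in chain:
        recomputed = block_hash(block["index"], block["data"], block["prev"])
        if block["prev"] != prev_hash or block["hash"] != recomputed:
            return block["index"]
        prev_hash = block["hash"]
    return None


chain = build_chain(["A pays B 5", "B pays C 2", "C pays A 1"])
print(first_invalid_block(chain))   # None: the untouched chain verifies

chain[1]["data"] = "B pays C 200"   # tamper with one record
print(first_invalid_block(chain))   # 1: the stored hash no longer matches the data
```

If the attacker also recomputes the hash of block 1, verification fails at block 2 instead, because block 2 still stores the old hash of block 1. This is why every later block would have to be rewritten.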

Hashing: refers to taking an input of arbitrary size, applying an algorithm to it, and generating a fixed-size output called the hash. The input can be arbitrarily large: the algorithm takes any number of input bits, applies some calculations to them, and outputs a fixed number of bits, e.g. 256 bits. Hashes represent the current state of the world. The input is the entire state of the blockchain, meaning all the transactions that have taken place so far, and the resulting output hash represents the blockchain.

Private/public key: When someone sends crypto coins over a blockchain, they send them to an address, which is a hashed version of the recipient's public key. Private keys are used to derive public keys and should never be shared. The private key generates a signature for each transaction: in effect, you sign the coins you send to others using your private key.

Encryption: prevents sensitive information from getting into the wrong hands, being misused, or being forged. The blockchain ledger records every transaction, which is verified, uploaded and secured by the participants on that particular blockchain.
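The fixed-size property of hashing can be checked with a short sketch using Python's standard `hashlib` module. SHA-256 is chosen here only because the notes mention 256 bits; the helper name `sha256_hex` is made up for this example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as 64 hexadecimal characters (256 bits)."""
    return hashlib.sha256(data).hexdigest()


tiny = sha256_hex(b"a")                 # 1 byte of input
huge = sha256_hex(b"a" * 1_000_000)     # 1 MB of input
print(len(tiny) * 4, len(huge) * 4)     # 256 256: the output size does not depend on the input size

original = sha256_hex(b"Alice pays Bob 5")
altered = sha256_hex(b"Alice pays Bob 6")
print(original == altered)              # False: a one-character change gives a different hash
```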

Mining: Blockchain mining involves adding transactions to the existing blockchain ledger of transactions, which is distributed among all users of the blockchain. Mining involves creating a hash of a block of transactions that cannot be easily forged, protecting the integrity of the entire blockchain without needing a central system. Bitcoin mining is the process by which transactions are verified and added to the public ledger, known as the blockchain, and also the means through which new bitcoins are released. ... The mining process involves compiling recent transactions into blocks and trying to solve a computationally difficult puzzle.
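The "computationally difficult puzzle" can be sketched as a search for a nonce that makes the block's hash start with a given number of zeros. This is a simplified illustration: the `mine` and `verify` names, the string encoding and the hex-prefix rule are assumptions for this example, whereas Bitcoin itself hashes a binary block header twice with SHA-256 and compares the result against a numeric target.

```python
import hashlib


def mine(block_data, difficulty):
    """Find the smallest nonce whose SHA-256(block_data + nonce) starts with `difficulty` zero hex digits."""
    target_prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(target_prefix):
            return nonce, digest
        nonce += 1


def verify(block_data, nonce, difficulty):
    """Checking a solution takes a single hash, however long the search took."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()
    return digest.startswith("0" * difficulty)


transactions = "A pays B 5; B pays C 2"
nonce, digest = mine(transactions, difficulty=4)
print(nonce, digest)
print(verify(transactions, nonce, 4))   # True
```

Each extra zero digit multiplies the expected number of attempts by 16, while verification stays a single hash. That asymmetry is what makes a block hard to forge but easy for the rest of the network to check.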

SOFTWARE ARCHITECTURE
Definition: The software architecture of a program or computing system is the structure or structures of the system, which comprise software elements, the externally visible properties of those elements, and the relationships among them.

Software architecture addresses specific structural issues: protocols for communication, synchronization and data access; assignment of functionality to design elements; physical distribution; composition of design elements; scaling and performance; and selection among design alternatives.

An architectural model is a type of scale model, a physical representation of a structure, built to study aspects of an architectural design or to communicate design ideas.

Quality Attributes

* Availability -> structuring the software system so that it avoids failures and, when failures do occur, can detect and recover from them.
* Modifiability -> one of the most important attributes, because software systems are by nature expected to change a lot. Not every element has to change; the concern is how to localize a change and manage the dependencies between elements.
* Performance -> how fast the system responds to events.
* Security -> a measure of the system's ability to resist unauthorized usage while still providing access and its services to legitimate users.
* Testability -> the ease with which software can be made to demonstrate its faults through testing.
* Usability -> how easy it is for the user to accomplish a desired task and the kind of user support the system provides.
* Interpretability ->

An architectural style (or pattern) defines a vocabulary of component and connector types, and a set of constraints on how they can be combined. Each style comes with pros and cons in terms of the quality attributes. When designing a software architecture, a suitable style is chosen based on the objectives of the system. Usually different styles are used depending on the objectives.

Client Server Style

* The client makes requests to the server.
* The server in many cases is a database with application logic represented as stored procedures.

Components: Clients and servers (a server can serve multiple clients without knowing their identities; clients know the server's identity)
Connectors: RPC-based network interaction protocols
Configuration: Two tiers

Advantages
* Higher security (the server can be protected by implementing encryption protocols, and clients can be constrained to use certain security protocols)
* Centralized data access
* Ease of maintenance (user interfaces can be modified without modifying servers)

Disadvantages
* Scalability or extensibility (if there is one server and many clients accessing it, there can be performance issues)
* Reliability (if the server fails, clients lose access to the service)

When to use it
* Web applications
* When you want to centralize data storage, backup and management functions
* When the system must support many clients, client types and devices
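The roles in this style can be sketched in a few lines. This is an in-process illustration only: the `Server` and `Client` classes, the account data and the procedure names are hypothetical, and a direct method call stands in for the RPC that would cross the network in a real system.

```python
class Server:
    """Central server: owns the data and the application logic ("stored procedures")."""

    def __init__(self):
        self._balances = {"alice": 100, "bob": 50}   # centralized data store
        self._procedures = {"get_balance": self._get_balance, "deposit": self._deposit}

    def handle(self, procedure, *args):
        """Entry point for every request; the server does not need to know who is calling."""
        if procedure not in self._procedures:
            return {"ok": False, "error": f"unknown procedure {procedure!r}"}
        return {"ok": True, "result": self._procedures[procedure](*args)}

    def _get_balance(self, account):
        return self._balances[account]

    def _deposit(self, account, amount):
        self._balances[account] += amount
        return self._balances[account]


class Client:
    """A client knows the server's identity and reaches the data only through requests."""

    def __init__(self, server):
        self._server = server

    def call(self, procedure, *args):
        # Stand-in for an RPC over the network (assumption of this sketch).
        reply = self._server.handle(procedure, *args)
        if not reply["ok"]:
            raise RuntimeError(reply["error"])
        return reply["result"]


server = Server()
web_client, mobile_client = Client(server), Client(server)
web_client.call("deposit", "alice", 25)
print(mobile_client.call("get_balance", "alice"))   # 125: both clients see the same central data
```

Because every request goes through `handle`, the single server is also the bottleneck and the single point of failure listed under the disadvantages.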

Layered Architectural Style

* Corresponds to a stack of layers, each layer providing a service to the layer above and acting as a client to the layer below.
* In a pure layered system, each layer is hidden from all layers except those adjacent to it.

Connectors: Protocols that determine how the layers interact
Components: Reside in the layers of the hierarchy
Used in: Layered communication protocols (TCP/IP)

Strengths
* Supports designs based on levels of abstraction and partitions complex problems
* Ease of maintenance: if layers only interact with the layers directly above and below, a change has minimal effect
* Reuse: different implementations of the same layer can be interchanged

Weaknesses
* Not all systems are easily structured in a layered fashion
* Performance suffers when non-adjacent layers need to interact
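A pure layered system can be sketched with three classes, each holding a reference only to the layer directly below it. The layer names and the sign-up scenario are invented for illustration.

```python
class StorageLayer:
    """Bottom layer: stores raw key/value pairs."""

    def __init__(self):
        self._rows = {}

    def write(self, key, value):
        self._rows[key] = value

    def read(self, key):
        return self._rows.get(key)


class BusinessLayer:
    """Middle layer: a client of the storage layer and a service for the presentation layer."""

    def __init__(self, storage):
        self._storage = storage

    def register_user(self, name):
        if self._storage.read(name) is not None:
            raise ValueError(f"{name} already registered")
        self._storage.write(name, {"name": name})


class PresentationLayer:
    """Top layer: talks only to the business layer, never directly to storage."""

    def __init__(self, business):
        self._business = business

    def signup(self, name):
        try:
            self._business.register_user(name)
            return f"Welcome, {name}!"
        except ValueError as err:
            return f"Error: {err}"


ui = PresentationLayer(BusinessLayer(StorageLayer()))
print(ui.signup("ada"))   # Welcome, ada!
print(ui.signup("ada"))   # Error: ada already registered
```

Any object with the same `read` and `write` methods could replace `StorageLayer` without touching the two layers above it, which is the reuse strength listed above. The cost is that the presentation layer cannot shortcut to storage, even when that would be faster.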

Peer to Peer Style

* A network of loosely coupled autonomous peers, each peer acting as both a client and a server.
* Peers hold state and behavior; P2P decentralizes information and control.
* Once the desired information is located, the peer obtains the direct address of the peer that holds it.
* Popular in file-sharing applications

Components: Peers (autonomous components)
Connectors: Network protocols (wireless links, antennas, mobile devices communicating with each other)
Topology: Dynamic network
Qualities: Scalable (peers appear and disappear and the system keeps working); robust (no single point of failure)

Example: BitTorrent
* Every user is a peer.
* A centralized machine called the tracker oversees the process by which a file is distributed to an interested set of peers, but the tracker is not responsible for the transfer itself.
* Metadata is associated with each file: size of the file

When to use:
* In distributed systems whose devices can be heterogeneous and mutually independent
* When robustness in the face of independent failures and high scalability are required

Avoid it when:
* The trustworthiness of independent peers cannot be assured or managed
* Designated nodes to support resource discovery are unavailable
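The division of labour between tracker and peers in the BitTorrent example can be sketched as follows. This is a heavily simplified in-process model: the class and method names are hypothetical, whole files are copied in one step, and real BitTorrent exchanges files in pieces over the network.

```python
class Tracker:
    """Knows which peers hold which file; it never transfers file data itself."""

    def __init__(self):
        self._holders = {}   # file name -> set of peers holding it

    def announce(self, file_name, peer):
        self._holders.setdefault(file_name, set()).add(peer)

    def leave(self, peer):
        for peers in self._holders.values():
            peers.discard(peer)

    def peers_for(self, file_name):
        return list(self._holders.get(file_name, ()))


class Peer:
    """Each peer acts as both a client (download) and a server (serve)."""

    def __init__(self, name, tracker):
        self.name = name
        self._tracker = tracker
        self._files = {}

    def share(self, file_name, content):
        self._files[file_name] = content
        self._tracker.announce(file_name, self)

    def serve(self, file_name):
        """Server role: hand the file to a requesting peer."""
        return self._files[file_name]

    def download(self, file_name):
        """Client role: locate a holder via the tracker, then fetch directly from that peer."""
        for peer in self._tracker.peers_for(file_name):
            if peer is not self:
                self.share(file_name, peer.serve(file_name))   # this peer now serves the file too
                return True
        return False


tracker = Tracker()
a, b, c = Peer("a", tracker), Peer("b", tracker), Peer("c", tracker)
a.share("notes.pdf", b"exam notes")
b.download("notes.pdf")           # b fetches directly from a
tracker.leave(a)                  # a disappears from the network
print(c.download("notes.pdf"))    # True: c can still fetch the file from b
```

The file stays available after the original holder leaves, which illustrates the robustness claim. The tracker in this sketch is still a designated discovery node, so its loss would correspond to the "avoid it when" case above.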

When a software system's implementation diverges from its designed architecture, this is referred to as architectural drift. It usually happens:

* During software evolution
* During maintenance, when the software undergoes changes as a result of bug fixes and updates
* During initial implementation of the system, because the programmers do not understand the architecture
* Under time pressure, when programmers focus on delivering rather than on the architecture
* When programmers believe that the architecture needs changes

Architecture recovery: To provide suitable processes, techniques and tools for achieving implementations that are consistent with the architecture. Perform recovery or reverse engineering. Use consistency tools such as JITTAC.

JITTAC (Just-in-time architectural consistency tool): It gives us a clear sense of what has actually happened in the implementation, which has always been very difficult to visualize or internalize.
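The idea of checking an implementation against its intended architecture can be sketched with two sets of dependency edges. This is not how JITTAC works internally; the module names, the dependency data and the `find_drift` helper are all made up to illustrate the concept of a consistency check.

```python
# Planned architecture: which module may depend on which (hypothetical names).
ALLOWED = {
    "ui": {"services"},
    "services": {"storage"},
    "storage": set(),
}

# Dependencies found in the implementation, e.g. by scanning imports (made-up data).
ACTUAL = [
    ("ui", "services"),
    ("services", "storage"),
    ("ui", "storage"),   # shortcut added under time pressure
]


def find_drift(allowed, actual):
    """Return (unplanned, missing): edges in the code but not the plan, and the reverse."""
    actual_edges = set(actual)
    planned_edges = {(src, dst) for src, targets in allowed.items() for dst in targets}
    unplanned = sorted(actual_edges - planned_edges)
    missing = sorted(planned_edges - actual_edges)
    return unplanned, missing


print(find_drift(ALLOWED, ACTUAL))   # ([('ui', 'storage')], [])
```

Running such a check on every change, rather than once a year, is what "just in time" suggests: drift is reported while the programmer still remembers why the shortcut was added.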

Data Analytics

Big Data:
* High Volume, Velocity and Variety (Gartner)
* Data whose analysis is non-trivial in terms of computing power (Prof. Hand)
* Understand the problem and collect data in a more targeted way

Big Data Analysis: designing algorithms that help us extract underlying patterns within the data

Sources:
* Specialist devices: medicine (blood pressure, heart rate) and engineering (the Hubble telescope takes real-time images of the galaxy and analyzes them)
* Sensor data (remote sensors and CCTV that record and generate data on ozone, nitrogen oxides and air pollution)
* Publications (newspapers, websites, medical and scientific online publications)
* Social media (users putting up data willingly)

Issues:
* Data is expensive to generate and store
* Data alone does not solve problems; expertise is essential

* Data privacy and security
* Data silos: they store all of the data you have captured in separate, disparate units that have nothing to do with one another, so no insights can be gathered from it because it is not integrated on the back end. Data integration (or, to be technical, data harmonization) is essential for getting the full advantage out of Big Data: it addresses the back-end need to get data silos working together so that deeper insight can be obtained.

Techniques:
1. Visualization and integration
2. Selecting and clustering data (identifies people's behavior in order to direct products at them)
3. Network models (an efficient way to model lots of complex interactions; useful if important points in a network can be identified). A network model can represent:
   * Correlation
   * Partial correlation
   * Equations that capture the relationship between points in a network
   * False discovery rate (it can seem that all of the links between things are important when they are not, because modern data is quite different)
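The difference between correlation and partial correlation in a network model can be shown with a small sketch. It uses the standard first-order partial correlation formula; the data are invented so that `x` and `y` are both driven by a third variable `z`, and the function names are chosen for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def partial_corr(xs, ys, zs):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up data: x and y each follow z plus a small independent wobble.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [zi + e for zi, e in zip(z, [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5])]
y = [zi + e for zi, e in zip(z, [0.5, 0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5])]

print(round(pearson(x, y), 2))           # about 0.95: x and y look strongly linked
print(round(partial_corr(x, y, z), 2))   # about -0.11: the link vanishes once z is accounted for
```

In a network drawn from plain correlations there would be an edge between `x` and `y`; with partial correlations only the edges to `z` remain. This is one way spurious links, and therefore false discoveries, are reduced.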

Search based software engineering

Definition: Search Based Software Engineering (SBSE) is an approach to software engineering in which search based optimization algorithms are used to identify optimal or near optimal solutions and to yield insight. SBSE has been applied to problems throughout the SE lifecycle, from requirements and project planning to maintenance and reengineering. The approach is attractive because it offers a suite of adaptive, automated and semi-automated solutions in situations typified by large, complex problem spaces with multiple competing and conflicting objectives.

How search works: In search based software engineering, the term "search" refers to search as in search based optimization, not search in the sense of web search or library search. Typically, a "near optimal" solution is sought in a search space of candidate solutions, guided by a fitness function that distinguishes between better and worse solutions.

Benefits:
1. Generality: As the many SBSE surveys reveal, SBSE is very widely applicable. We can make progress with an instance of SBSE with only two definitions: a representation of the problem and a fitness function that captures the objective or objectives to be optimized.
2. Robustness: SBSE's optimisation algorithms are robust. Often the solutions required need only lie within some specified tolerance. Those starting out with SBSE can easily become immersed in "parameter tuning" to get the most performance from their SBSE approach.

3. Scalability Through Parallelism: Search based optimisation techniques are often referred to as being "embarrassingly parallel" because of their potential for scalability through parallel execution of fitness computations. Several SBSE authors have demonstrated that this parallelism can be exploited in SBSE work to obtain scalability through distributed computation.
4. Re-unification: SBSE can also create linkages and relationships between areas in Software Engineering that would otherwise appear to be completely unrelated. For instance, the problems of Requirements Engineering and Regression Testing would appear to be entirely unrelated topics; they have their own conferences and journals, and researchers in one field seldom exchange ideas with those from the other.

Algorithms used:

Hill Climbing: Hill Climbing selects a point from the search space at random. It then examines candidate solutions that are in the "neighbourhood" of the original, i.e. solutions in the search space that are similar but differ in some aspect, or are close on some ordinal scale. If a neighbouring candidate solution of improved fitness is found, the search "moves" to that new solution. It then explores the neighbourhood of that new candidate solution for better solutions, and so on, until the neighbourhood of the current candidate solution offers no further improvement. Such a solution is said to be locally optimal, and may not represent a globally optimal solution (as in Figure 3a), so the search is often restarted in order to find even better solutions (as in Figure 3b). Hill Climbing may be restarted as many times as computing resources allow.

Simulated Annealing: Simulated Annealing (Figure 5), first proposed by Kirkpatrick et al. [56], is similar to Hill Climbing in that it too attempts to improve one solution. However, Simulated Annealing attempts to escape local optima without the need to continually restart the search. It does this by temporarily accepting candidate solutions of poorer fitness, depending on the value of a variable known as the temperature. Initially the temperature is high, and free movement is allowed through the search space, with poorer neighbouring solutions representing potential moves along with better neighbouring solutions. As the search progresses, however, the temperature reduces, making moves to poorer solutions more and more unlikely. Eventually, freezing point is reached, and from this point on the search behaves identically to Hill Climbing.

Genetic Algorithm: Genetic algorithms are categorized as global search heuristics and are typically applied to discrete problems. A genetic algorithm is a search heuristic from the field of artificial intelligence that mimics the process of natural selection. Genetic algorithms are easy to implement, but their behavior is difficult to understand. Two things are required for a genetic algorithm: a genetic representation and a fitness function.
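The hill climbing and simulated annealing descriptions above can be sketched on a toy problem. Everything specific here is an assumption made for illustration: the one-dimensional `LANDSCAPE`, the left/right neighbourhood, the steepest-ascent move rule, and the geometric cooling schedule with its parameter values.

```python
import math
import random

# Toy search space: positions 0..14, fitness given by this list (made-up values).
# Local optima are at positions 2 and 12; the global optimum is at position 7.
LANDSCAPE = [1, 3, 5, 4, 2, 6, 9, 12, 10, 7, 3, 4, 8, 6, 2]


def fitness(x):
    return LANDSCAPE[x]


def neighbours(x):
    """Solutions that differ slightly from x: the positions directly left and right."""
    return [n for n in (x - 1, x + 1) if 0 <= n < len(LANDSCAPE)]


def hill_climb(start):
    """Move to the best neighbour until no neighbour improves on the current solution."""
    current = start
    while True:
        best = max(neighbours(current), key=fitness)
        if fitness(best) <= fitness(current):
            return current   # locally optimal
        current = best


def restart_hill_climb(restarts, rng):
    """Restart from random points and keep the best local optimum found."""
    results = [hill_climb(rng.randrange(len(LANDSCAPE))) for _ in range(restarts)]
    return max(results, key=fitness)


def simulated_annealing(start, rng, temperature=10.0, cooling=0.95, steps=2000):
    """Accept worse neighbours with a probability that shrinks as the temperature falls."""
    current = best = start
    for _ in range(steps):
        candidate = rng.choice(neighbours(current))
        delta = fitness(candidate) - fitness(current)
        # Improvements are always accepted; worse moves with probability exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = candidate
        if fitness(current) > fitness(best):
            best = current
        temperature = max(temperature * cooling, 1e-9)   # cool down, avoiding division by zero
    return best


print(hill_climb(0))   # 2: stuck on a local optimum
print(hill_climb(4))   # 7: this start happens to lie in the global optimum's basin
rng = random.Random(0)
print(restart_hill_climb(20, rng))
print(simulated_annealing(0, rng))
```

The two definitions that SBSE needs are visible here: the representation (a position in the list) and the fitness function. Replacing them with, say, a test suite ordering and a coverage score would leave the search code itself unchanged.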

How has search helped:

* Requirements engineering: Can find the best possible subset of requirements that matches user requests.
* Debugging: Finds bugs and fixes them.
* Testing: Facebook
* Software optimization: finds parts of software that can be optimiz...