Inf30015 Knowledge Management Assignment PDF

Title	Inf30015 Knowledge Management Assignment
Author	Syed Karim
Course	Knowledge Management
Institution	Swinburne University of Technology
Pages	5


Abstract

Knowledge management has been described as a critical factor in ensuring strategic competitive advantage in today's business world. It spans everything from capturing knowledge to implementing an organizational learning culture. This paper examines a major aspect of knowledge management: data mining and knowledge discovery. It gives an overview of data warehousing and data mining, along with a critical review of the WEKA software as an effective tool for data mining.

Introduction

The rise of knowledge-based economies has heightened the importance of managing knowledge effectively. Effective knowledge management (KM) has been described as a critical ingredient for organisations seeking sustainable strategic competitive advantage. An integral part of KM is creating and capturing information and making it available and usable within or between organisations. KM is essentially about getting the right knowledge to the right person at the right time. In recent years, the significance of KM has been widely recognised as the foundations of industrialized economies have shifted from natural resources to intellectual capital. Sveiby (1997) gives a simple account of such capital: “the difference between market value of a publicly held company and its official net book value is the value of its intangible assets”. KM benefits all sectors, including education, banking, and production/manufacturing. Several areas need to be addressed when implementing effective knowledge management practices. These include:

- Creating an intranet
- Creating data warehouses
- Implementing decision support tools
- Implementing groupware to support collaboration

This paper reviews and critically analyses the Waikato Environment for Knowledge Analysis (WEKA) software as a tool for data mining. WEKA came to fruition through a perceived need for a unified workbench that would give researchers easy access to state-of-the-art procedures and algorithms in machine learning. Today WEKA is widely recognized as a landmark system in data mining and machine learning. The paper gives a broad overview of data warehousing and data mining, along with a critical review of WEKA as an effective tool for data mining and KM.

Research on Data Warehousing and Data Mining

A data warehouse is a “subject-oriented, integrated, time-varying, non-volatile collection of data that is used primarily in organizational decision making” (Inmon 1992). Generally, the data warehouse is maintained separately from the organization’s operational databases. The primary reason is that the data warehouse supports on-line analytical processing (OLAP), whose functional and performance requirements differ from those of the on-line transaction processing (OLTP) applications traditionally supported by operational databases. A data warehouse is a purpose-built repository of data intended to assist decision making. Its data arrives from operational systems and a wide variety of external sources. The characteristics of a data warehouse are (Inmon 1992):

- Subject orientation – data is organised around business entities
- Integration – common data elements used by multiple applications are treated consistently
- Time variance – data is kept as dated snapshots, so history is retained as conditions change
- Non-volatility – data is loaded into the warehouse and read, but not routinely updated or deleted
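The time-variant and non-volatile traits can be illustrated with a minimal Python sketch. It assumes an append-only store of dated snapshots; `SalesFact`, `MiniWarehouse`, and the sample figures are hypothetical names and data invented for this example, not part of any warehouse product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SalesFact:
    """One warehouse row: a subject (customer), a measure, and a snapshot date."""
    customer_id: str
    amount: float
    as_of: date


class MiniWarehouse:
    """Hypothetical append-only store used only to illustrate the traits."""

    def __init__(self):
        self._rows = []

    def load(self, row):
        # Non-volatile: rows are appended, never updated or deleted in place.
        self._rows.append(row)

    def history(self, customer_id):
        # Time-variant: every dated snapshot for a subject is retained.
        rows = [r for r in self._rows if r.customer_id == customer_id]
        return sorted(rows, key=lambda r: r.as_of)


warehouse = MiniWarehouse()
warehouse.load(SalesFact("C1", 120.0, date(2016, 1, 31)))
warehouse.load(SalesFact("C1", 150.0, date(2016, 2, 29)))
warehouse.load(SalesFact("C2", 80.0, date(2016, 1, 31)))

amounts = [r.amount for r in warehouse.history("C1")]
print(amounts)
```

Because nothing is overwritten, an analyst can compare the January and February snapshots for the same customer, which an operational database that keeps only the current value would not allow.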

A data warehousing environment has the following two traits:

1. Derived information is presented for analysis
2. The environment is ever-changing, i.e. many updates occur

In such an environment, either manual analysis is performed with the support of appropriate visualization tools, such as graphs or charts, or (semi-)automatic data mining is performed (M. Ester, Xiaowei Xu 1998). “Data mining is the process of applying intelligent methods to extract data patterns” (G. S. Reddy 2010). It involves sorting through large data sets to find patterns and establish relationships that help solve problems through data analysis. Data mining is an essential step in capturing and spreading knowledge throughout an organization, and it primarily assists in predictive analysis. The stages of data mining include:

- Data exploration and clustering – searching for new data and grouping it into meaningful sub-classes
- Modelling – users create a model to test and evaluate
- Deploying models – acting on the results
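The three stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the spending records, the `predict` helper, and the choice of a nearest-centroid model are invented here and are not taken from Weka or from the cited sources.

```python
from statistics import mean

# Hypothetical labelled records: (monthly_spend, segment). Invented data.
train = [(20, "low"), (25, "low"), (30, "low"),
         (80, "high"), (90, "high"), (100, "high")]
held_out = [(22, "low"), (95, "high")]

# Stage 1, exploration: group the records and summarise each group.
groups = {}
for spend, segment in train:
    groups.setdefault(segment, []).append(spend)
centroids = {segment: mean(values) for segment, values in groups.items()}


# Stage 2, modelling: a nearest-centroid rule, evaluated on held-out records.
def predict(spend):
    """Return the segment whose average spend is closest to `spend`."""
    return min(centroids, key=lambda segment: abs(spend - centroids[segment]))


hits = [predict(spend) == segment for spend, segment in held_out]
accuracy = sum(hits) / len(hits)

# Stage 3, deployment: act on the model's output for a new record.
decision = predict(70)
print(centroids, accuracy, decision)
```

In practice each stage would be far richer, but the flow is the same: summarise the data, fit and evaluate a model, then use it on new cases.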

Review of WEKA Data Mining Software

The Weka Data Mining Software has been downloaded 200,000 times since it was made available on SourceForge in April 2000 and is currently downloaded at a rate of 10,000 per month. The SIGKDD Service Award is the highest service award in the field of data mining and knowledge discovery, and it was awarded to the Weka team for their development and deployment of the freely available Weka Data Mining Software along with its accompanying documentation. The critical success factors for Weka are:

- It delivers various algorithms for data mining and machine learning
- It is open source and freely available
- It is platform-independent
- It is easily usable by people who are not data mining specialists
- It provides flexible facilities for scripting experiments
- It has been well maintained and updated

How does it work?

Weka covers a comprehensive set of algorithms for a range of data mining tasks. These include tools for data engineering (called “filters”) and algorithms for attribute selection, clustering, association rule learning, classification and regression. The main graphical user interface is called the “Explorer”. It has six panels that correspond to the data mining tasks it supports. In the “Preprocess” panel, data can be loaded from various sources, including files, URLs and databases. The second panel, “Classify”, gives access to Weka’s classification and regression algorithms. Classification algorithms can produce models such as decision trees, while regression algorithms produce regression curves or regression trees. This panel also lets users assess the resulting models, both numerically through statistical estimation and graphically through visualization of the data and inspection of the model, provided the model structure is amenable to visualization. Users can also load and save models.

The third panel, “Cluster”, enables users to apply clustering algorithms to the dataset; again, the outcome can be visualized. Clustering is one of two methodologies for analysing data without an explicit target attribute that must be predicted. The other is association rule learning, which allows users to perform a market-basket type analysis of the data. The fourth panel, “Associate”, provides access to algorithms for learning association rules. Attribute selection, another vital data mining task, is supported by the fifth panel, “Select attributes”. This provides access to various methods for measuring the usefulness of attributes and for finding attribute subsets that are predictive of the data. Users who like to examine the data visually are supported by the final panel, “Visualize”. This presents a color-coded scatter plot matrix, and users can then select and enlarge individual plots. It is also possible to zoom in on portions of the data, to retrieve the exact record underlying a data point, and so on (E. Frank, M. Hall, G. Holmes 2009).

Thus, by using Weka, analysts can mine data effectively, uncover hidden patterns and thereby discover knowledge. With the machine learning algorithms that Weka provides, data can be classified, grouped into meaningful subclasses within the database, and examined statistically or graphically. This supports the knowledge capture and transfer described by Nonaka and Takeuchi’s SECI model.
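To make the market-basket idea concrete, the following Python sketch computes support and confidence for simple one-item rules. It is a brute-force illustration, not Weka's own association rule learner; the baskets, the thresholds, and the helper names `support` and `confidence` are assumptions made for this example.

```python
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "butter", "jam"},
]


def support(itemset):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent):
    """Of the baskets containing `antecedent`, the share that also hold `consequent`."""
    return support(antecedent | consequent) / support(antecedent)


MIN_SUPPORT = 0.4      # assumed thresholds; tune for real data
MIN_CONFIDENCE = 0.7

rules = []
items = sorted(set().union(*baskets))
for a, b in combinations(items, 2):
    for lhs, rhs in ((a, b), (b, a)):
        s = support({lhs, rhs})
        c = confidence({lhs}, {rhs})
        if s >= MIN_SUPPORT and c >= MIN_CONFIDENCE:
            rules.append((lhs, rhs, s, c))

for lhs, rhs, s, c in rules:
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

With these baskets only the bread and butter pair passes both thresholds, in both directions. Real association rule learners avoid enumerating every pair by pruning infrequent itemsets first.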

WEKA and Big Data Analysis

The word “big” in big data refers to volume. Big data can be quantified by size in terabytes or petabytes, and also by the number of transactions, tables, or files. Moreover, one of the things that makes big data really big is that it comes from a greater variety of sources than ever before, including logs, clickstreams, and social media. Using these sources for analytics means that common structured data is now joined by unstructured data, such as text and human language, and semi-structured data, such as eXtensible Markup Language (XML) or Rich Site Summary (RSS) feeds. Without data mining, big data would be of little use, as it would be impractical to summarize, simplify and discover similarities in the large amount of data being collected. WEKA offers a comprehensive set of algorithms for two main types of data mining used in big data analysis (J. Navas, P. Catalina, G. Parra, Y. Camilo 2016):

1. Association – finding relationships in the data by connecting variables according to the similarities among them
2. Clustering – dividing the data into small groups when the resemblance among records is not known in advance, with the purpose of finding similarities within the groups and assessing the results obtained
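Clustering as described above can be sketched with a minimal k-means loop in Python. This is a simplified illustration rather than Weka's implementation: the points, the starting centroids, and the fixed iteration count are assumptions chosen to keep the example deterministic.

```python
from math import dist

# Hypothetical 2-D records with no labels, invented for illustration.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Assumed starting centroids: one point from each visible group.
final_centroids, clusters = kmeans(points, [points[0], points[3]])
print(final_centroids)
print(clusters)
```

The two groups are found without any target attribute being supplied, which is the sense in which clustering discovers similarities that were not known beforehand.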

Conclusion

Knowledge discovery and intellectual capital have become essential for organizations to stay competitive in the market. In this information age, with vast amounts of data available, data mining is a first-rate technique for exploring and exploiting data, discovering knowledge, and retaining and sharing it within an organization to gain a competitive edge. Weka has made an outstanding contribution to the data mining field and has two advantages over other data mining software. First, it is open source, which means not only that it can be obtained free of charge but, more importantly, that it can be maintained and modified without depending on the commitment, health, or longevity of any particular institution or company. Second, it provides a vast collection of machine learning algorithms that can be applied to any given problem.

References

1. Frank, E., Hall, M., Holmes, G., Kirkby, R., Pfahringer, B., Witten, I.H. and Trigg, L., 2009. Weka-a machine learning workbench for data mining. In Data Mining and Knowledge Discovery Handbook (pp. 1269-1277).
2. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P. and Witten, I.H., 2009. The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter, 11(1), pp. 10-18.
3. Inmon, W.H., 1992. Building the Data Warehouse.
4. M. Ester, Xiaowei Xu, 1998. Incremental Clustering for Mining in a Data Warehousing Environment.
5. J. Navas, P. Catalina, G. Parra, Y. Camilo, 2016. Big Data Tools: Hadoop, MongoDB and Weka, pp. 449-456.
...