Lab 02 - Guidance and Lab work PDF

Title	Lab 02 - Guidance and Lab work
Course	Data Analysis
Institution	Seneca College


Lab 02: Creating an Elasticsearch Cluster

Mangesh Bhattacharya

2022-01-28 — SRT 411: Digital Data Analysis — Dr. Asma Paracha

Table of Contents

Task 1: Create & Test the Cluster

Task 2: Data Query

Task 3: Index Management

Task 4: Answer the following Question

Conclusion

Citation

Creating an Elasticsearch Cluster

PAGE 2

Task 1: Create & Test the Cluster

In Lab 02, we had to create, configure, and test our cluster. Elasticsearch was already installed on our Red Hat 8 virtual machine. To get the result for cluster-1 or master-cluster, we had to edit /etc/elasticsearch/elasticsearch.yml and add the following information manually:

1. Node name, master & data:

2. Cluster Name:

3. Network Host:

4. Discovery: Contains master, machine 1, and machine 2 IP addresses
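As a rough sketch of the four settings listed above, the following Python helper renders an elasticsearch.yml fragment for each node. The node names and IP addresses are the ones used later in this lab. Everything else is an assumption: the setting keys are those of an Elasticsearch 7.x install, the cluster name master-cluster is one of the two names mentioned in the text, the true/false role values are guesses because the screenshots are not available, and render_node_config is a hypothetical helper, not part of Elasticsearch.

```python
import json

# Node names and IPs as used in this lab; the cluster name is an assumption.
CLUSTER_NAME = "master-cluster"
NODES = {
    "Srt411master": "192.168.13.128",
    "Srt411data1": "192.168.13.129",
    "Srt411data2": "192.168.13.130",
}


def render_node_config(node_name, is_master, is_data):
    """Return elasticsearch.yml text for one node (hypothetical helper)."""
    seed_hosts = list(NODES.values())
    lines = [
        f"cluster.name: {CLUSTER_NAME}",
        f"node.name: {node_name}",
        # Legacy 7.x role flags; newer releases use node.roles instead.
        f"node.master: {str(is_master).lower()}",
        f"node.data: {str(is_data).lower()}",
        f"network.host: {NODES[node_name]}",
        # A JSON list is also a valid YAML flow sequence.
        f"discovery.seed_hosts: {json.dumps(seed_hosts)}",
        f'cluster.initial_master_nodes: {json.dumps(["Srt411master"])}',
    ]
    return "\n".join(lines) + "\n"


master_yml = render_node_config("Srt411master", is_master=True, is_data=False)
data1_yml = render_node_config("Srt411data1", is_master=False, is_data=True)
print(master_yml)
```

Listing all three addresses under discovery.seed_hosts on every node is what lets each machine find the others when it starts.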

Secondly, we had to configure Kibana by editing /etc/kibana/kibana.yml. After that step, we start Kibana with the systemctl command and check the status of the service:

1. /etc/kibana/kibana.yml: edited server.port, server.host, server.name & elasticsearch.hosts


2. sudo systemctl start and sudo systemctl status for kibana and elasticsearch:
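The four kibana.yml keys named in step 1 could look roughly like the output of this sketch. The key names come from the text; the values are assumptions (5601 and 9200 are the default Kibana and Elasticsearch HTTP ports, and the host and name shown are the master machine's), and render_kibana_config is a hypothetical helper.

```python
import json


def render_kibana_config(server_host, server_name, es_hosts, server_port=5601):
    """Return kibana.yml text with the four keys edited in this lab."""
    return "\n".join([
        f"server.port: {server_port}",
        f'server.host: "{server_host}"',
        f'server.name: "{server_name}"',
        # A JSON list is also a valid YAML flow sequence.
        f"elasticsearch.hosts: {json.dumps(es_hosts)}",
    ]) + "\n"


# Assumed values for the master machine.
kibana_yml = render_kibana_config(
    server_host="192.168.13.128",
    server_name="Srt411master",
    es_hosts=["http://192.168.13.128:9200"],
)
print(kibana_yml)
```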

Thirdly, I created two (2) VMs to join the master node. Both are CentOS 9 machines, called Machine 1 (IP address 192.168.13.129) and Machine 2 (IP address 192.168.13.130). After creating these two machines, I installed Elasticsearch and Kibana on each and then configured both services to connect node 1 to the master node and to node 2. The images below show the following:

1. /etc/elasticsearch/elasticsearch.yml – Node 1 – Srt411data1


2. /etc/kibana/kibana.yml – Node 1 – 192.168.13.129

3. sudo systemctl start, status kibana & elasticsearch
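The start and status steps are the same on every machine, so they can be scripted. This is a minimal sketch that only builds the systemctl commands and, if asked, runs them; the function names are hypothetical, and it assumes the units are named elasticsearch and kibana as in the steps above.

```python
import subprocess

SERVICES = ["elasticsearch", "kibana"]


def systemctl_commands(services=SERVICES):
    """Build the start and status commands for each service, in order."""
    commands = []
    for service in services:
        commands.append(["sudo", "systemctl", "start", service])
        commands.append(["sudo", "systemctl", "status", service, "--no-pager"])
    return commands


def run_all(commands, runner=subprocess.run):
    """Run each command and return the exit codes.

    systemctl status exits non-zero when a unit is not active, so the
    codes are collected rather than raising on the first failure.
    """
    return [runner(cmd).returncode for cmd in commands]


# Print the commands without executing them.
for cmd in systemctl_commands():
    print(" ".join(cmd))
```

Calling run_all(systemctl_commands()) on a node would execute the four commands in order; a non-zero code on a status line points at the service that failed to start.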


Once all three machines are configured, the nodes are started one by one. After the services start, each node (Master, Node 1, and Node 2) is running:

Master: Srt411master – 192.168.13.128

Node 1: Srt411data1 – 192.168.13.129

Node 2: Srt411data2 – Static IP – 192.168.13.130

In this case, the two nodes would not connect to the master whether or not the firewall was stopped. Therefore, following a recommendation, I uninstalled and reinstalled both machines:
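Before reinstalling, one way to narrow down a connection problem like this is to test whether the Elasticsearch ports are reachable from the other machines. The sketch below is an assumption about how such a check could be done, not a step performed in this lab. It uses the default ports (9200 for HTTP, 9300 for node-to-node transport), and port_open and check_node are hypothetical helper names.

```python
import socket

# Default Elasticsearch ports: 9200 for HTTP, 9300 for node-to-node transport.
ES_PORTS = (9200, 9300)


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_node(host, ports=ES_PORTS):
    """Map each port to whether it is reachable on the given host."""
    return {port: port_open(host, port) for port in ports}
```

Running check_node("192.168.13.128") from a data node would show whether the master's ports answer. A False on 9300 with the firewall stopped would suggest the service is bound to a different address than the one in network.host.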

Task 2: Data Query


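The difference between the POST and PUT commands is that PUT writes a document to an ID chosen by the client, so repeating the same request replaces the same document, whereas POST without an ID lets Elasticsearch generate a new ID, so each request creates a new document. Every write is assigned the next sequence number; for example, while sending the requests, the sequence number changed from 5 to 6 to 7. Refer to the image below:

POST method: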

The GET command results:


The DELETE command result:

Retrieve document 2 of myData result:
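The behaviour of the four commands used in this task can be imitated with a small in-memory model. This is a toy sketch, not Elasticsearch's implementation: ToyIndex is a hypothetical class, a single counter stands in for the per-shard sequence number, and the document body is made up.

```python
import uuid


class ToyIndex:
    """In-memory stand-in for one index; not Elasticsearch's implementation."""

    def __init__(self):
        self.docs = {}      # _id -> {"_source": ..., "_version": n, "_seq_no": n}
        self.next_seq = 0   # every write takes the next sequence number

    def _write(self, doc_id, source):
        version = self.docs.get(doc_id, {}).get("_version", 0) + 1
        self.docs[doc_id] = {"_source": source, "_version": version,
                             "_seq_no": self.next_seq}
        self.next_seq += 1
        return {"_id": doc_id, **self.docs[doc_id]}

    def put(self, doc_id, source):
        """PUT /<index>/_doc/<id>: create or replace the document at doc_id."""
        return self._write(doc_id, source)

    def post(self, source):
        """POST /<index>/_doc: let the index generate a fresh ID."""
        return self._write(uuid.uuid4().hex, source)

    def get(self, doc_id):
        """GET /<index>/_doc/<id>: return the stored entry or None."""
        return self.docs.get(doc_id)

    def delete(self, doc_id):
        """DELETE /<index>/_doc/<id>: remove the document if present."""
        found = self.docs.pop(doc_id, None) is not None
        if found:
            self.next_seq += 1  # a successful delete also uses a sequence number
        return found


index = ToyIndex()
first = index.put("2", {"name": "example"})
second = index.put("2", {"name": "example"})   # same ID: replaced, not duplicated
third = index.post({"name": "example"})        # no ID given: new document
print(first["_seq_no"], second["_seq_no"], third["_seq_no"])
print(len(index.docs))
```

The two PUT calls leave one document with its version raised to 2, while the POST adds a second document, and each of the three writes consumes one sequence number.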


Task 3: Index Management

Screenshots showing the cluster health and information about the nodes using index management:

1. Cluster State:

2. Cluster Health:

The above screenshot still shows the cluster health as yellow, which means that all primary shards are allocated but one or more replica shards are unassigned. The cause still needs to be analyzed.
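For reference, the three health colours can be read off a cluster health response as in the sketch below. The field names follow the GET _cluster/health response format; the sample values are hypothetical and are not the numbers from the screenshot, and describe_health is an illustrative helper.

```python
def describe_health(health):
    """Summarise a parsed _cluster/health response (a dict)."""
    status = health["status"]
    unassigned = health.get("unassigned_shards", 0)
    if status == "green":
        meaning = "all primary and replica shards are allocated"
    elif status == "yellow":
        meaning = "all primaries are allocated, but some replicas are not"
    else:
        meaning = "at least one primary shard is not allocated"
    return f"{status}: {meaning} ({unassigned} unassigned shard(s))"


# Hypothetical response values, shaped like GET _cluster/health output.
sample = {
    "cluster_name": "master-cluster",
    "status": "yellow",
    "number_of_nodes": 3,
    "number_of_data_nodes": 2,
    "active_primary_shards": 5,
    "active_shards": 5,
    "unassigned_shards": 5,
}
print(describe_health(sample))
```

A yellow status with a non-zero unassigned_shards count usually means replicas have no second data node to live on, so the number of data nodes that actually joined is the first thing to check.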


Task 4: Answer the following Question

1. What is a cluster and node in Elasticsearch?

a. A cluster is a collection of nodes that share the same cluster name attribute, and a node is a single server that is part of the cluster. A node stores data and takes part in the cluster's indexing and search capabilities.

2. What are the main differences between master and data nodes?

a. The master nodes are responsible for managing the cluster, while the data nodes, as the name suggests, are responsible for the data. For example, when installing and configuring master and data nodes, it is recommended to install at least three (3) master-eligible nodes, so that a majority can still elect a master if one fails, and at least two (2) data nodes, all on reliable machines with fast, dependable storage so that the cluster can be scaled later.

3. Which type of nodes holds the documents?

a. Data nodes hold the documents. Master nodes handle cluster-wide actions instead, for instance, creating and removing indexes, tracking nodes, and distributing shards to nodes.

4. Describe how ES stores data?

a. There are two varieties of data that can be stored in ES:

i. JSON documents, which can contain lists, text, mapped coordinates, and numbers.

ii. Binary data or information.

b. If the records are stored elsewhere and Elasticsearch is used only for indexing, the _source field can be disabled. By default, Elasticsearch stores every document sent to it, so it works both as a search engine and as a document store. It can also store binary data.
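The two kinds of data described above can be shown with the Python standard library alone. This is a sketch under assumptions: the field names and values are made up for illustration, and the binary content is carried as a Base64 string, which is how a JSON body has to transport raw bytes.

```python
import base64
import json

# Hypothetical document; the field names and values are illustrative only.
document = {
    "title": "Lab 02",
    "tags": ["cluster", "kibana"],                 # list
    "location": {"lat": 43.65, "lon": -79.38},     # coordinates
    "pages": 16,                                   # number
    # Binary content travels as a Base64 string inside the JSON body.
    "attachment": base64.b64encode(b"\x89PNG\r\n").decode("ascii"),
}

body = json.dumps(document)              # what would be sent to the index
restored = json.loads(body)              # the same structure read back
raw = base64.b64decode(restored["attachment"])
print(body)
```

The original document returned with a search hit is the _source field discussed in answer b, which is why disabling it only makes sense when the records are kept somewhere else.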

5. Describe the benefits of JSON objects.

a. Benefits include that Elasticsearch can detect the data type of each field automatically, index the records, and make them searchable.

6. Compare a traditional RDBMS and ES for data storage.

a. A relational database can store data and also index that same data. An RDBMS is better at reading data that has just been written. Elasticsearch is better at fast search, with extras such as many kinds of text normalization.
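The idea of fast search with normalization can be illustrated with a toy inverted index. This is a simplified sketch of the general technique, not how Elasticsearch is implemented; the function names and the two sample texts are made up, and normalization here is only lowercasing and splitting on punctuation.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase and split on non-alphanumerics (a very small analyzer)."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def build_index(docs):
    """Map each normalized term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in normalize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents containing every term in the query."""
    terms = normalize(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(t, set()) for t in terms))


docs = {1: "Creating an Elasticsearch Cluster", 2: "Kibana talks to the CLUSTER"}
index = build_index(docs)
print(search(index, "cluster"))
print(search(index, "elasticsearch CLUSTER"))
```

Because both the documents and the query pass through the same normalize step, "CLUSTER" and "cluster" match, and a lookup touches only the terms in the query rather than scanning every record.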


Conclusion:

In Lab 02, we had to create, configure, and test our cluster. Elasticsearch was already installed on our Red Hat 8 virtual machine. To get the result for cluster-1 or master-cluster, we had to edit /etc/elasticsearch/elasticsearch.yml. Secondly, we had to configure Kibana by editing /etc/kibana/kibana.yml, then start Kibana with the systemctl command and check the status of the service. I then created two VMs to join the master node. After creating these two machines, I installed Elasticsearch and Kibana on each and configured both services to connect node 1 to the master node and to node 2.

The difference between the POST and PUT commands is that PUT writes a document to an ID chosen by the client, so repeating the same request replaces the same document, whereas POST without an ID lets Elasticsearch generate a new ID, so each request creates a new document. Every write is assigned the next sequence number; for example, while sending the requests, the sequence number changed from 5 to 6 to 7.

A cluster is a collection of nodes that share the same cluster name attribute, and a node is a single server that is part of the cluster. A node stores data and takes part in the cluster's indexing and search capabilities. The master nodes are responsible for managing the cluster, while the data nodes, as the name suggests, are responsible for the data. When installing and configuring master and data nodes, it is recommended to install at least three master-eligible nodes, so that a majority can still elect a master if one fails, and at least two data nodes, all on reliable machines with fast, dependable storage so that the cluster can be scaled later. Data nodes hold the documents, while master nodes handle creating and removing indexes, tracking nodes, and distributing shards to nodes.

There are two varieties of data that can be stored in ES: JSON documents, which can contain lists, text, mapped coordinates, and numbers, and binary information. If the records are stored elsewhere and Elasticsearch is used only for indexing, the _source field can be disabled. By default, Elasticsearch stores every document sent to it, so it works both as a search engine and as a document store.


Citation

“What is Elasticsearch?,” Magedelight Resources, 30-Dec-2020. [Online]. Available: https://www.magedelight.com/resources/what-is-elasticsearch-benefits-of-magento-2elasticsearch/. [Accessed: 25-Jan-2022].

“Comparing Database Management Systems,” AltexSoft, 15-Oct-2019. [Online]. Available: https://www.altexsoft.com/blog/business/comparing-database-management-systems-mysql-postgresqlmssql-server-mongodb-elasticsearch-and-others/. [Accessed: 26-Jan-2022].

Nodes, clusters, and shards in Elasticsearch - S1E3: Mini Beginner's Crash Course. YouTube, 2021.

“Elasticsearch vs RDBMS: Learn top 12 comparison you need to know,” EDUCBA, 01-Mar-2021. [Online]. Available: https://www.educba.com/hadoop-vs-rdbms/. [Accessed: 01-Feb-2022].

“Understanding elasticsearch cluster, node, index and document using example,” JavaInUse. [Online]. Available: https://www.javainuse.com/elasticsearch/elasticnode. [Accessed: 02-Feb-2022].

E. van Baaren, “Can Elasticsearch store data?,” Medium, 26-Dec-2019. [Online]. Available: https://medium.com/tech-explained/can-elasticsearch-store-data-4b840308ea6f. [Accessed: 03-Feb-2022].

D. Horovits, “The Complete Guide to the elk stack,” Logz.io, 01-Nov-2021. [Online]. Available: https://logz.io/learn/complete-guide-elk-stack/#elk-in-production. [Accessed: 03-Feb-2022].
