Title | Lab 02 - Guidance and Lab work |
---|---|
Course | Data Analysis |
Institution | Seneca College |
Pages | 16 |
Lab 02: Creating an Elasticsearch Cluster
Mangesh Bhattacharya
2022-01-28 — SRT 411: Digital Data Analysis — Dr. Asma Paracha
Table of Contents
Task 1: Create & Test the Cluster
Task 2: Data Query
Task 3: Index Management
Task 4: Answer the following Question
Conclusion
Citation
Creating an Elasticsearch Cluster
Task 1: Create & Test the Cluster
In Lab 02, we had to create, configure, and test our cluster. Elasticsearch was already installed on our Red Hat 8 virtual machine. To set up cluster-1, the master cluster, we had to edit /etc/elasticsearch/elasticsearch.yml and add the following information manually:
1. Node name, master & data:
2. Cluster Name:
3. Network Host:
4. Discovery: Contains master, machine 1, and machine 2 IP addresses
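The four groups of settings listed above can be sketched as a small Python helper that renders the text of an elasticsearch.yml file. This is only an illustration under stated assumptions: the cluster name `srt411-cluster` is hypothetical (the report does not record it), the exact values used in the lab are in the missing screenshots, and `node.master` / `node.data` are the legacy role settings accepted by Elasticsearch 7.x (newer releases use `node.roles` instead). The node name and IP addresses are the ones given later in this report.

```python
def render_elasticsearch_yml(cluster_name, node_name, host, seed_hosts,
                             initial_masters, master=True, data=True):
    """Return the text of a minimal elasticsearch.yml for one node."""
    def yaml_list(items):
        # Render a Python list as an inline YAML list of quoted strings.
        return "[" + ", ".join(f'"{item}"' for item in items) + "]"

    lines = [
        f"cluster.name: {cluster_name}",
        f"node.name: {node_name}",
        f"node.master: {str(master).lower()}",   # legacy 7.x role setting
        f"node.data: {str(data).lower()}",       # legacy 7.x role setting
        f"network.host: {host}",
        f"discovery.seed_hosts: {yaml_list(seed_hosts)}",
        f"cluster.initial_master_nodes: {yaml_list(initial_masters)}",
    ]
    return "\n".join(lines) + "\n"


# Master, machine 1, and machine 2 IP addresses from this lab.
SEED_HOSTS = ["192.168.13.128", "192.168.13.129", "192.168.13.130"]

master_config = render_elasticsearch_yml(
    cluster_name="srt411-cluster",       # hypothetical name, not from the report
    node_name="Srt411master",
    host="192.168.13.128",
    seed_hosts=SEED_HOSTS,
    initial_masters=["Srt411master"],
)
print(master_config)
```

The same function could be called with the node name and IP address of each data node; whether those nodes were also master-eligible is not stated in the report, so the `master` and `data` flags would need to match the real configuration.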
Secondly, we had to edit and configure Kibana in /etc/kibana/kibana.yml. After that step, we used the systemctl command to start Kibana and check the status of the service:
1. /etc/kibana/kibana.yml: edited – server.port, server.host, server.name & elasticsearch.hosts
2. sudo systemctl start and sudo systemctl status for kibana and elasticsearch:
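As a rough sketch of the two steps above, the following Python snippet renders the four edited kibana.yml settings and builds the systemctl commands without executing them. The values are assumptions, since the report's screenshots are not available: port 5601 is Kibana's default, `Srt411master` is borrowed from the node name as a plausible `server.name`, and `elasticsearch.hosts` is assumed to point at the master node on Elasticsearch's default HTTP port 9200.

```python
import shlex


def render_kibana_yml(server_host, server_name, elasticsearch_hosts,
                      server_port=5601):
    """Return the four kibana.yml settings edited in this lab as text."""
    hosts = ", ".join(f'"{host}"' for host in elasticsearch_hosts)
    return (
        f"server.port: {server_port}\n"
        f'server.host: "{server_host}"\n'
        f'server.name: "{server_name}"\n'
        f"elasticsearch.hosts: [{hosts}]\n"
    )


def service_commands(services=("elasticsearch", "kibana")):
    """Build (but do not run) the start and status commands per service."""
    commands = []
    for service in services:
        commands.append(["sudo", "systemctl", "start", service])
        commands.append(["sudo", "systemctl", "status", service])
    return commands


kibana_config = render_kibana_yml(
    server_host="192.168.13.128",                      # master node IP from this lab
    server_name="Srt411master",                        # assumed value
    elasticsearch_hosts=["http://192.168.13.128:9200"],  # assumed value
)
print(kibana_config)
for command in service_commands():
    print(shlex.join(command))
```

Starting Elasticsearch before Kibana is a deliberate ordering choice in this sketch, because Kibana needs a reachable Elasticsearch host to become ready.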
Thirdly, I created two (2) VMs to join the master node. Both are CentOS 9 machines, called Machine 1 and Machine 2. Machine 1's IP address is 192.168.13.129 and Machine 2's IP address is 192.168.13.130. After creating these two machines, I installed Elasticsearch and Kibana on each and configured them so that node 1 and node 2 connect to the master node. The images below show the following:
1. /etc/elasticsearch/elasticsearch.yml – Node 1 – Srt411data1
2. /etc/kibana/kibana.yml – Node 1 – 192.168.13.129
3. sudo systemctl start, status kibana & elasticsearch
Once all three machines were configured, I started the nodes one by one and checked that the services on each node were running, i.e., Master, Node 1, and Node 2:
Master: Srt411master – 192.168.13.128
Node 1: Srt411data1 – 192.168.13.129
Node 2: Srt411data2 – Static IP – 192.168.13.130
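One way to confirm that all three nodes actually joined the cluster is to compare the node names above with the output of the `_cat/nodes` API. The sketch below is a minimal illustration, assuming the cluster answers plain HTTP on port 9200 without authentication; the sample rows are made up to show the response shape and were not captured from the lab cluster.

```python
import json
import urllib.request

# Node names taken from this report.
EXPECTED_NODES = {"Srt411master", "Srt411data1", "Srt411data2"}


def missing_nodes(cat_nodes_rows, expected=EXPECTED_NODES):
    """Return the expected node names absent from a _cat/nodes JSON response."""
    present = {row["name"] for row in cat_nodes_rows}
    return sorted(expected - present)


def fetch_cat_nodes(base_url="http://192.168.13.128:9200"):
    """Query a live cluster; call this only where the cluster is reachable."""
    url = f"{base_url}/_cat/nodes?format=json&h=name,ip"
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)


# Illustrative response in which the second data node has not joined yet.
sample_rows = [
    {"name": "Srt411master", "ip": "192.168.13.128"},
    {"name": "Srt411data1", "ip": "192.168.13.129"},
]
print(missing_nodes(sample_rows))
```

On the lab network, `missing_nodes(fetch_cat_nodes())` would return an empty list once every node has joined.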
In this case, the two nodes would not connect whether or not I stopped the firewall. Therefore, following a recommendation, I uninstalled and re-installed both machines.
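Before re-installing, a quick reachability test can help separate network or firewall problems from configuration problems. The following is a hedged sketch, not the procedure used in the lab: it tries a TCP connection to Elasticsearch's default HTTP port (9200) and default transport port (9300, used for node-to-node traffic). The demonstration at the end runs against a temporary local listener so that it works anywhere.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_cluster_ports(hosts, ports=(9200, 9300)):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {(host, port): port_is_open(host, port)
            for host in hosts for port in ports}


# Self-contained demonstration against a temporary local listener.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
print(port_is_open("127.0.0.1", demo_port))
listener.close()
```

On the lab machines, calling `check_cluster_ports(["192.168.13.128", "192.168.13.129", "192.168.13.130"])` from each VM would show which node cannot reach which port. A closed 9300 between nodes would be consistent with nodes that start but never join the cluster.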
Task 2: Data Query
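The difference between the POST and PUT commands lies in how the document ID is chosen. PUT writes a document to an explicit ID (for example, document 2), so repeating the same request replaces the same document instead of creating a new one. POST to the index without an ID lets Elasticsearch generate a new ID, so each request creates a new document. Every write also increments the sequence number (_seq_no) reported in the response; for example, while sending the requests, the sequence number changed from 5 to 6 to 7. Refer to the image below:
POST method: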
The GET command results:
The DELETE command result:
Retrieve document 2 of myData result:
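The four request types used in this task can be sketched with Python's standard library. This is an illustration under assumptions rather than a record of the lab commands: the index is written as `mydata` because Elasticsearch index names must be lowercase, the document body is hypothetical, and the cluster is assumed to accept plain HTTP on port 9200 without authentication. The requests are only built and printed here; passing one to `urllib.request.urlopen` would send it.

```python
import json
import urllib.request

BASE_URL = "http://192.168.13.128:9200"   # master node from this lab
INDEX = "mydata"                          # assumed lowercase index name


def build_request(method, doc_id=None, document=None):
    """Build an HTTP request against the index's _doc endpoint."""
    url = f"{BASE_URL}/{INDEX}/_doc"
    if doc_id is not None:
        url += f"/{doc_id}"               # explicit ID chosen by the client
    body = None if document is None else json.dumps(document).encode("utf-8")
    headers = {"Content-Type": "application/json"} if body else {}
    return urllib.request.Request(url, data=body, headers=headers,
                                  method=method)


def describe_get_response(payload):
    """Summarize the JSON body returned by a GET on a single document."""
    if payload.get("found"):
        return f"document {payload['_id']} found: {payload['_source']}"
    return f"document {payload.get('_id')} not found"


example = {"title": "example"}            # hypothetical document body
requests_to_send = [
    build_request("PUT", doc_id=2, document=example),   # create or replace ID 2
    build_request("POST", document=example),            # Elasticsearch picks the ID
    build_request("GET", doc_id=2),                     # retrieve document 2
    build_request("DELETE", doc_id=2),                  # remove document 2
]
for request in requests_to_send:
    print(request.get_method(), request.full_url)
```

A GET issued after the DELETE returns HTTP 404 with `"found": false` in the body, which `describe_get_response` reports as not found.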
Task 3: Index Management
Screenshots showing the cluster health and information about the nodes using index management:
1. Cluster State:
2. Cluster Health:
The above screenshot still shows the cluster health as yellow. Yellow means that all primary shards are allocated but one or more replica shards are unassigned. The cause still needs to be analyzed.
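To make the health colours concrete, the sketch below interprets the JSON returned by `GET _cluster/health`. The field names (`status`, `unassigned_shards`) are part of that API's response; the sample values and the cluster name are invented for illustration and do not come from the lab cluster.

```python
def explain_health(health):
    """Turn a _cluster/health response into a one-line explanation."""
    status = health["status"]
    unassigned = health.get("unassigned_shards", 0)
    if status == "green":
        return "green: all primary and replica shards are allocated"
    if status == "yellow":
        return (f"yellow: all primary shards are allocated, but "
                f"{unassigned} replica shard(s) are unassigned")
    return "red: at least one primary shard is unassigned"


# Illustrative response; values are hypothetical.
sample_health = {
    "cluster_name": "srt411-cluster",
    "status": "yellow",
    "number_of_nodes": 3,
    "active_primary_shards": 5,
    "unassigned_shards": 2,
}
print(explain_health(sample_health))
```

When the status is yellow, the `GET _cluster/allocation/explain` API is a reasonable next step, since it reports why a particular shard copy could not be assigned.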
Task 4: Answer the following Question
1. What is a cluster and node in Elasticsearch?
a. A cluster is a collection of one or more nodes that share the same cluster name. A node is a single server that is part of the cluster; it stores data and takes part in the cluster's indexing and search capabilities.
2. What are the main differences between master and data nodes?
a. The master nodes are responsible for managing the cluster, while the data nodes, as the name suggests, are responsible for the data. For example, when installing and configuring master and data nodes for production, it is recommended to run at least three master-eligible nodes (so that a voting majority survives the loss of one) and at least two data nodes, placed on reliable machines with fast, dependable storage so the cluster can scale later.
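3. Which type of nodes holds the documents?
a. Data nodes hold the documents: they store the shards and carry out indexing and search on them. Master nodes handle cluster-wide actions instead, for instance, creating and removing indexes, tracking nodes, & distributing shards to nodes.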
4. Describe how ES stores data?
a. Two kinds of data can be stored in ES:
i. JSON documents containing lists, text, geo coordinates, and numbers
ii. Binary data
b. By default, Elasticsearch stores every document indexed into it, so it works both as a search engine and a document store. If the records are stored externally and Elasticsearch is used only for indexing, the _source field can be disabled. It can also store binary data.
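The two storage options mentioned above can be sketched as request bodies. This is a minimal illustration, assuming hypothetical field names (`title`, `attachment`): the first function builds an index-creation body in which the `_source` field can be switched off, and the second prepares a document whose binary payload is Base64-encoded, which is the form Elasticsearch's `binary` field type expects.

```python
import base64
import json


def build_index_body(store_source=True):
    """Index-creation body with an optional _source switch and a binary field."""
    return {
        "mappings": {
            "_source": {"enabled": store_source},
            "properties": {
                "title": {"type": "text"},          # hypothetical field
                "attachment": {"type": "binary"},   # hypothetical field
            },
        }
    }


def build_document(title, raw_bytes):
    """Encode raw bytes as Base64 so they can travel inside a JSON document."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return {"title": title, "attachment": encoded}


index_body = build_index_body(store_source=False)
document = build_document("example", b"\x00\x01binary payload")
print(json.dumps(index_body, indent=2))
print(json.dumps(document))
```

Disabling `_source` saves disk space but has a cost: the original JSON can no longer be returned by a GET, and features that depend on it, such as reindexing from the index itself, stop working.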
5. Describe the benefits of JSON objects.
a. Elasticsearch accepts JSON documents, detects the data types of their fields automatically, indexes the records, and makes them searchable.
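The type detection mentioned in this answer can be approximated in a few lines. The sketch below is a simplified model of Elasticsearch's default dynamic mapping, not its actual implementation: it ignores date detection for strings and the `keyword` sub-field that Elasticsearch adds to text fields, and the sample document is hypothetical.

```python
def guess_field_type(value):
    """Approximate the field type dynamic mapping would pick for a JSON value."""
    if isinstance(value, bool):       # checked first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # Arrays take their type from the first element; empty arrays add no field.
        return guess_field_type(value[0]) if value else None
    return None                       # null values add no field


def guess_mapping(document):
    """Map each top-level field of a document to its guessed type."""
    guessed = {name: guess_field_type(value) for name, value in document.items()}
    return {name: kind for name, kind in guessed.items() if kind is not None}


sample_document = {                   # hypothetical document
    "name": "sensor-1",
    "count": 3,
    "ratio": 0.5,
    "active": True,
    "tags": ["lab", "test"],
    "location": {"lat": 43.7, "lon": -79.4},
    "note": None,
}
print(guess_mapping(sample_document))
```

This is why no schema has to be declared before the first document is indexed, although an explicit mapping is still the safer choice when field types matter.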
6. Compare a traditional RDBMS and ES for data storage.
a. A relational database can store data & also index that same data. An RDBMS is better at reading data that has just been written, whereas Elasticsearch makes newly indexed documents searchable only after a short refresh delay. Elasticsearch is better at fast search, with extras such as many kinds of text normalization.
Conclusion:
In Lab 02, we had to create, configure, and test our cluster. Elasticsearch was already installed on our Red Hat 8 virtual machine. To set up cluster-1, the master cluster, we edited the Elasticsearch configuration file. We then edited the Kibana configuration file, started the services with the systemctl command, and checked their status. Next, I created two VMs to join the master node, installed Elasticsearch and Kibana on them, and configured node 1 and node 2 to connect to the master node.
The difference between the POST and PUT commands lies in how the document ID is chosen: PUT writes a document to an explicit ID, so repeating the request replaces the same document, whereas POST without an ID lets Elasticsearch generate a new one. Every write increments the sequence number; for example, while sending the requests, the sequence number changed from 5 to 6 to 7.
A cluster is a collection of one or more nodes that share the same cluster name, and a node is a single server that is part of the cluster; it stores data and takes part in the cluster's indexing and search capabilities. The master nodes are responsible for managing the cluster (creating and removing indexes, tracking nodes, & distributing shards to nodes), while the data nodes hold the documents. For production, it is recommended to run at least three master-eligible nodes and at least two data nodes on reliable machines with fast, dependable storage.
Two kinds of data can be stored in ES: JSON documents, which contain lists, text, geo coordinates, and numbers, and binary data. By default, Elasticsearch stores every document indexed into it, so it works both as a search engine and a document store. If the records are stored externally and Elasticsearch is used only for indexing, the _source field can be disabled.
Citation
“What is Elasticsearch?,” Magedelight Resources, 30-Dec-2020. [Online]. Available: https://www.magedelight.com/resources/what-is-elasticsearch-benefits-of-magento-2elasticsearch/#:~:text=In%20simple%20words%2C%20ElasticSearch%20accepts%20JSON%20documents%2C%20detects,it%20automatically%20gets%20added%20to%20the%20mapping%20definitions. [Accessed: 25-Jan-2022].
“Comparing Database Management Systems,” AltexSoft, 15-Oct-2019. [Online]. Available: https://www.altexsoft.com/blog/business/comparing-database-management-systems-mysql-postgresqlmssql-server-mongodb-elasticsearch-and-others/. [Accessed: 26-Jan-2022].
Nodes, clusters, and shards in Elasticsearch - S1E3:Mini Beginner's Crash Course. YouTube, 2021.
“Elasticsearch vs RDBMS: Learn top 12 comparison you need to know,” EDUCBA, 01-Mar-2021. [Online]. Available: https://www.educba.com/hadoop-vs-rdbms/. [Accessed: 01-Feb-2022].
“Understanding elasticsearch cluster, node, index and document using example,” JavaInUse. [Online]. Available: https://www.javainuse.com/elasticsearch/elasticnode#:~:text=1%20Cluster%20is%20a%20collection%20of%20one%20or,row%20in%20relational%20databases.%20...%20More%20items...%20. [Accessed: 02-Feb-2022].
E. van Baaren, “Can Elasticsearch store data?,” Medium, 26-Dec-2019. [Online]. Available: https://medium.com/tech-explained/can-elasticsearch-store-data-4b840308ea6f#:~:text=Storing%20JSON%20data%20in%20Elasticsearch%20By%20default%2C%20Elasticsearch,your%20data%20in%20Elasticsearch%20and%20retrieve%20it%20too. [Accessed: 03-Feb-2022].
D. Horovits, “The Complete Guide to the elk stack,” Logz.io, 01-Nov-2021. [Online]. Available: https://logz.io/learn/complete-guide-elk-stack/#elk-in-production. [Accessed: 03-Feb-2022].