Department of Information Technology. Independent student work: "Clustering task"

K-means clustering algorithm

Independent student work: Бердіғалиев Ақниет, ИС 20-11

The name comes from the English word "mean", that is, "average value". The algorithm consists of four steps.
1. The number of clusters k to be formed from the objects of the initial sample is set.
2. k records of the initial sample are selected at random to serve as the initial cluster centers. These starting points, from which the clusters then grow, are often called "seeds". Each such record is a kind of "embryo" of a cluster consisting of only one element.
3. For each record of the initial sample, the closest cluster center is determined, and the record is assigned to that cluster.
4. The centroids, that is, the centers of gravity of the clusters, are calculated. Steps 3 and 4 are then repeated with the centroids as the new cluster centers until the assignments stop changing.
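
The four steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: records are tuples of numbers, closeness is measured by squared Euclidean distance, and steps 3 and 4 repeat until the assignments stop changing (the usual convention for K-means). The names `k_means`, `squared_distance`, `max_iterations` and `seed` are chosen here for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(records, k, max_iterations=100, seed=0):
    """Return (centers, assignments) for a list of numeric tuples."""
    rng = random.Random(seed)
    # Step 2: choose k records at random as the initial centers ("seeds").
    centers = rng.sample(records, k)
    assignments = None
    for _ in range(max_iterations):
        # Step 3: assign every record to its closest cluster center.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(record, centers[c]))
            for record in records
        ]
        if new_assignments == assignments:
            break  # no record changed cluster, so the result is stable
        assignments = new_assignments
        # Step 4: recompute each centroid as the mean of its members.
        for c in range(k):
            members = [r for r, a in zip(records, assignments) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(
                    sum(column) / len(members) for column in zip(*members)
                )
    return centers, assignments


# Step 1: the number of clusters k is chosen by the caller.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, assignments = k_means(points, k=2)
print(centers)
print(assignments)
```

Because the seeds are chosen at random, the order of the clusters, and on less clearly separated data the clusters themselves, can differ between runs with different `seed` values.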

Collecting a semantic core cluster for a website is carried out in several stages:

  1. Adding the website URL to the cluster collection tool. If the site does not exist yet, you can group without an address: the result will not change. Specify a project name, which will greatly simplify navigation.

  2. Loading queries for the semantic core cluster. You can add the data as a table: phrases are collected from the first sheet, regardless of their order. Make sure the first sheet contains no unrelated data, because the system will record it as queries. If the queries are available as a plain list, you can simply copy and paste them into the appropriate field.

  3. Selecting a grouping method. One option is precision, which sets the number of URL matches at which a phrase falls into a group. Note that mixing different semantics on one page of the site may cause positions for some queries to rise sharply and for others to fall.

  4. Getting the result. Collecting the cluster takes several minutes. Online services also let you start grouping the semantic core and then minimize or close the browser page. On completion, a notification is sent to the specified email address. The report can be downloaded in table format.
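
The precision setting from step 3 can be illustrated with a small sketch. The text does not say how a cluster collection tool counts URL matches, so everything here is an assumption made for illustration: each query comes with a set of result URLs, the first query of a group acts as its leader, and a query joins the first group whose leader shares at least `precision` URLs with it. The function name `group_queries` and the sample data are hypothetical.

```python
def group_queries(top_urls, precision):
    """Greedily group queries by the number of URLs they share.

    top_urls:  dict mapping a query to a set of URLs (hypothetical input)
    precision: number of URL matches required to join a group
    """
    groups = []  # each group is a list of queries; the first is the leader
    for query, urls in top_urls.items():
        for group in groups:
            leader_urls = top_urls[group[0]]
            if len(urls & leader_urls) >= precision:
                group.append(query)
                break
        else:
            # No existing group matched, so the query starts a new one.
            groups.append([query])
    return groups


# Invented sample data, for illustration only.
sample = {
    "buy running shoes": {"a.example/shoes", "b.example/run", "c.example/shop"},
    "running shoes price": {"a.example/shoes", "b.example/run", "d.example/price"},
    "how to clean shoes": {"e.example/clean", "f.example/care", "c.example/shop"},
}

print(group_queries(sample, precision=2))
# [['buy running shoes', 'running shoes price'], ['how to clean shoes']]
```

In this sketch, lowering `precision` to 1 merges all three queries into one group, while raising it to 3 leaves each query in its own group.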

The result is presented on several sheets, each listing the key queries for a specific search engine with the selected region and accuracy. Every report contains sheets with topic leaders for each region and each search engine, in accordance with the initial settings.

If an accuracy range is specified, sheets are generated for each level in the range. This is convenient for users involved in website promotion: you can immediately compare the results and determine the optimal settings for a specific project, its promotion strategy and advertising campaign.

When deciding to collect a semantic core cluster for a website, pay attention to the cost of the procedure. It may vary depending on a number of factors:

  1. Number of search requests processed.

  2. Number of search engines used.

  3. Number of selected regions.

The optimal grouping method depends on the volume of semantics and the number of pages on the site being promoted. Some specialists prefer to combine several options at once, which helps achieve a more accurate result.

Among the current methods for collecting a cluster, you should pay attention to:
