Basic principles of Chinese word segmentation
(1) String-matching segmentation.
This approach comes in three variants: forward maximum matching, reverse maximum matching, and shortest-path segmentation.
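As a rough illustration, the three variants can be sketched as below. This is a minimal sketch, not a production segmenter: the toy dictionaries, the function names, and the rule that an unknown character becomes a one-character word are all assumptions made for the example. Real results depend entirely on the dictionary in use, so this sketch will not necessarily reproduce any particular published segmentation.

```python
def forward_max_match(text, vocab):
    """Scan left to right, always taking the longest dictionary word."""
    max_len = max(map(len, vocab))
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


def reverse_max_match(text, vocab):
    """Scan right to left, always taking the longest dictionary word."""
    max_len = max(map(len, vocab))
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                words.append(piece)
                j -= size
                break
    words.reverse()
    return words


def shortest_path_segment(text, vocab):
    """Pick the segmentation with the fewest words (dynamic programming)."""
    n = len(text)
    max_len = max(map(len, vocab))
    best = [0] + [n + 1] * n  # best[i] = fewest words covering text[:i]
    back = [0] * (n + 1)      # back[i] = start index of the last word
    for i in range(1, n + 1):
        for size in range(1, min(max_len, i) + 1):
            piece = text[i - size:i]
            if (size == 1 or piece in vocab) and best[i - size] + 1 < best[i]:
                best[i] = best[i - size] + 1
                back[i] = i - size
    words, i = [], n
    while i > 0:
        words.append(text[back[i]:i])
        i = back[i]
    return words[::-1]


# Hypothetical toy dictionaries, chosen only for this demonstration.
vocab_a = {"研究", "研究生", "生命", "的", "起源"}
vocab_b = {"不知道", "知道", "你在", "說什麼", "什麼"}

print(forward_max_match("研究生命的起源", vocab_a))
print(reverse_max_match("研究生命的起源", vocab_a))
print(shortest_path_segment("不知道你在說什麼", vocab_b))
```

With the first toy dictionary, the forward scan greedily takes "研究生" and strands "命", while the reverse scan recovers "研究" and "生命". This is the kind of disagreement between the two directions that the methods are known for.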
For example:
Take the sentence "不知道你在說什麼" ("I don't know what you are saying"). Forward maximum matching segments it as "不知道, 你, 在, 說什麼". Reverse maximum matching gives "不, 知道, 你在, 說, 什麼". Shortest-path segmentation gives "不知道, 你在, 說什麼".
(2) Semantic segmentation.
This method has the machine judge word boundaries by analyzing the sentence. It first performs syntactic and semantic analysis, then uses the resulting syntactic and semantic information to resolve ambiguities and arrive at a segmentation.
(3) Statistical segmentation.
This method relies on corpus statistics: it counts how often two adjacent characters appear together and uses that frequency to judge how strongly they belong together as a word, which in turn decides where to segment.
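The idea of counting adjacent-character frequencies can be sketched in a few lines. This is a deliberately simplified illustration: the tiny corpus, the function names, and the fixed `threshold` are all assumptions, and practical statistical segmenters use far larger corpora and more refined scores than raw pair counts.

```python
from collections import Counter


def bigram_counts(corpus):
    """Count how often each pair of adjacent characters occurs."""
    counts = Counter()
    for sentence in corpus:
        counts.update(a + b for a, b in zip(sentence, sentence[1:]))
    return counts


def segment_by_frequency(text, counts, threshold=2):
    """Keep adjacent characters together when their pair is frequent enough."""
    if not text:
        return []
    words = [text[0]]
    for prev, cur in zip(text, text[1:]):
        if counts[prev + cur] >= threshold:
            words[-1] += cur   # frequent pair: extend the current word
        else:
            words.append(cur)  # rare pair: start a new word
    return words


# Hypothetical miniature corpus, for demonstration only.
corpus = ["我知道", "你知道", "他知道", "你在家"]
counts = bigram_counts(corpus)
print(segment_by_frequency("你知道", counts))
```

Here "知道" occurs three times in the toy corpus and so stays together, while "你知" occurs only once and is split. Note that this naive rule can chain several frequent pairs into one long word, which a real system would need to guard against.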
SEO optimization with Chinese word segmentation
Chinese word segmentation splits a query into its component keywords. When a user searches for a keyword phrase, the search engine first returns results matching the whole phrase, and then returns results matching the split keywords.
In other words, optimizing for word segmentation mostly means recombining the separated keywords into one new keyword phrase that contains them all. There are three reasons to do this: ① it avoids keyword stuffing, ② it covers several keywords at once, and ③ a single keyword phrase carries more information.
Considerations for word-segmentation SEO
(1) The combined information must stay closely related to one topic.
In trying to squeeze as much information as possible into one keyword, you may end up making combinations that do not belong together. Such optimization is probably useless and can even be harmful.
The keyword may carry as much information as you wanted, but its focus is too scattered, which works against concentrating keyword weight.
(2) Page keywords that do not match the segmented words.
The keywords in the title may segment very well, but if the page itself does not contain the corresponding segmented words, some of those words will have little effect.
(3) Target precise keywords in content and avoid segmentation-based optimization.
In general, I suggest not relying on Chinese word segmentation when optimizing long-tail keywords. Apart from the home page, category list pages, and specific content-aggregation topic pages, it is generally not recommended.
The reason is that segmentation-based optimization is hard to do well. On ordinary editorial pages or long-tail keyword pages, we should concentrate on a single keyword. If the page covers too much information, the weight of the keyword we want to optimize gets diluted.