TWM624658U

TWM624658U - Prediction devices for predicting whether users belong to valuable user groups based on short-term user characteristics

Info

Publication number: TWM624658U
Application number: TW110210571U
Authority: TW
Inventors: 陳昶佑
Original assignee: 伊雲谷數位科技股份有限公司
Priority date: 2021-09-07
Filing date: 2021-09-07
Publication date: 2022-03-21

Abstract

A prediction device for predicting whether a user belongs to a value user group based on short-term characteristics of the user executes the following steps: obtaining a grouping result, wherein the grouping result indicates that at least one user belongs to a value user group; obtaining multiple user characteristics of the value user group, and establishing a first grouping model of the value user group based on the user characteristics, wherein at least one of the user characteristics is a long-term user characteristic; extracting short-term user characteristics from the first grouping model, and establishing a second grouping model of the value user group based on the short-term user characteristics; and using the second grouping model to predict whether another user belongs to the value user group based on short-term characteristics of that other user.

Description

Prediction device for predicting whether a user belongs to a value user group based on short-term user features

The present utility model relates to a prediction device for predicting whether a user belongs to a particular value user group, and in particular to a prediction device that makes this prediction from the user's short-term features.

In a traditional user management system, the provider of goods or services (that is, the enterprise offering them) often has to wait for the monthly, quarterly, semi-annual, or annual user bill before it can learn how much value each user contributes. User behavior, however, now changes quickly. Waiting for a periodic bill before deciding which value user group a user belongs to leaves the provider lost in statistics, unable to optimize its service for the current user environment in real time or to create opportunities for users to spend more.

For example, a quarterly bill may show that a user belongs to a value user group with a high purchase amount, but if the provider waits for the next quarterly bill to check whether the user still belongs to that group, the user may already have been lost. In short, a provider that understands user behavior in real time has a better chance of retaining high-value users, and may even convert near-high-value users into high-value users early.

To solve the problems of the prior art, the present utility model provides a prediction method for predicting whether a user belongs to a value user group based on short-term user features. The method is executed on a prediction device and includes: obtaining a grouping result, wherein the grouping result indicates that at least one user belongs to a value user group; performing feature engineering based on a machine learning algorithm to obtain a plurality of user features of the value user group, and building a first grouping model of the value user group from those user features, wherein at least one of the user features is a long-term user feature, a long-term user feature being a feature that can be obtained only once a target time has arrived; extracting a plurality of short-term user features from the first grouping model, and building a second grouping model of the value user group from those short-term user features, wherein a short-term user feature is a feature that can be obtained before the target time arrives; and using the second grouping model to predict, from a plurality of short-term user features of another user, whether that other user belongs to the value user group.

In an embodiment of the present utility model, a plurality of users are grouped according to a plurality of user data obtained for those users when the target time arrives, so as to obtain the grouping result.

In an embodiment of the present utility model, the grouping result is produced by an RFM model or by a machine learning algorithm.

In an embodiment of the present utility model, the prediction method further includes: determining a marketing strategy for the other user according to whether the other user is predicted to belong to the value user group.

In an embodiment of the present utility model, the user features of the value user group whose importance ranks among the top few, or exceeds a threshold, are selected as the plurality of user features of the value user group.

In an embodiment of the present utility model, the short-term user features in the first grouping model whose importance ranks among the top few, or exceeds a threshold, are selected as the plurality of short-term user features of the second grouping model.

In an embodiment of the present utility model, the short-term user features in the first grouping model whose importance ranks among the top few, or exceeds a threshold, are selected; further short-term user features are derived (extended) from the selected ones; and the selected and extended short-term user features together serve as the plurality of short-term user features of the second grouping model.

In an embodiment of the present utility model, the prediction method further includes: monitoring how the intersection between the users of the predicted value user group and the users of the value user group in the grouping result changes over time, so as to judge whether a marketing strategy for those users is correct.
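The monitoring step can be illustrated with a minimal Python sketch. It is not taken from the source: the function name `intersection_trend`, the sample user IDs, and the choice to express the overlap as a share of the predicted group are all assumptions made for illustration. How a growing or shrinking overlap should be interpreted for a given marketing strategy is left to the operator.

```python
def intersection_trend(snapshots):
    """Return (label, overlap_size, overlap_share) for each snapshot.

    snapshots: list of (label, predicted_users, current_users), where the
    last two items are sets of user IDs (hypothetical sample format).
    """
    trend = []
    for label, predicted, current in snapshots:
        overlap = predicted & current
        # Share of predicted value users who are also in the current value user group.
        share = len(overlap) / len(predicted) if predicted else 0.0
        trend.append((label, len(overlap), share))
    return trend


# Hypothetical snapshots taken at two points in time.
snapshots = [
    ("first time", {"u1", "u2", "u3", "u4"}, {"u1", "u2", "u9"}),
    ("second time", {"u1", "u2", "u3", "u4"}, {"u1", "u2", "u3", "u8"}),
]
trend = intersection_trend(snapshots)
overlap_is_growing = trend[-1][2] > trend[0][2]
```

Comparing the first and last snapshots is the simplest possible trend check; a real deployment would more likely track the whole series.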

In an embodiment of the present utility model, the prediction method further includes: merging the users of the predicted value user group with the users of the value user group in the grouping result, and treating the merged set as the users of the value user group.

In an embodiment of the present utility model, using the second grouping model to predict, from the short-term user features of the other user, whether the other user belongs to the value user group is done through a feature comparison, and the feature comparison is implemented by neural-network classification or by computing a cosine similarity.

In accordance with the purpose of the present utility model, a non-volatile storage medium is provided that stores a plurality of program codes, the program codes being read by a computer device to execute any one of the above prediction methods.

In accordance with the purpose of the present utility model, a prediction device for predicting whether a user belongs to a value user group based on short-term user features is provided. The device is implemented as a pure hardware circuit or as a computer device combined with software, and is configured as a plurality of modules that operate to execute any one of the above prediction methods.

Compared with the prior art, the present utility model can predict a user's value score (that is, predict whether the user belongs to a particular value user group) from short-term user features such as daily or per-transaction spending behavior. It can therefore target individual users, so that users' reactions to the current service are captured more promptly and comprehensively, without waiting for the final consumption result before responding.

1: Value user grouping system

11: Platform device

11S: User data set

111: Browsing behavior

112: Interests and preferences

113: Device information

114: Internet Protocol (IP) address or device identification

12: User management system

12S: User data set

121: Demographic data

122: Consumption records

123: Billing records

124: Interaction records

13: Public data source device

13S: User data set

131: Climate and geography

132: Finance and economy

133: Humanities and population

134: Medicine and biology

14: Database

15: Prediction device

151: Data processing module

152: Machine learning module

153: Model deployment module

154: Marketing module

21: Short-term user feature set

211~21M: Short-term user features

22: User group

231~23N: Value user groups

S31~S35: Steps

41: Current RFM grouping result

411: High-value user group

42: Predicted RFM grouping result

421: High-value user group

51: Current value user group

52: Predicted value user group

531~533: Intersections

FIG. 1 is a block diagram of a value user grouping system according to an embodiment of the present utility model.

FIG. 2 is a conceptual diagram of grouping users into value user groups by short-term user features according to an embodiment of the present utility model.

FIG. 3 is a flowchart of a prediction method for grouping users into value user groups by short-term user features according to an embodiment of the present utility model.

FIG. 4 is a schematic diagram of the predicted RFM grouping produced by the prediction method or device according to an embodiment of the present utility model, shown alongside the current RFM grouping.

FIG. 5A is a schematic diagram of the value user group obtained at a first time by the prediction method or device according to an embodiment of the present utility model, and of the current value user group.

FIG. 5B is a schematic diagram of the value user group obtained at a second time by the prediction method or device according to an embodiment of the present utility model, and of the current value user group.

FIG. 5C is another schematic diagram of the value user group obtained at the second time by the prediction method or device according to an embodiment of the present utility model, and of the current value user group.

Preferred embodiments of the present utility model are described in detail below with reference to the accompanying drawings, so that those skilled in the art can more readily understand its advantages and features and the scope of protection can be defined more clearly.

The present utility model uses machine learning to collect and analyze user data (including short-term user data, long-term user data, and non-time-related user data) in order to analyze user behavior (corresponding to user features, which include long-term, short-term, and non-time-related user features) and build a first grouping model; the first grouping model then groups users based on their user data. Next, the first grouping model is used to obtain the correlation between the user value score and the users' short-term features. This correlation can be understood as verifying whether the short-term user features contribute significantly to the user value score. If they do, a second grouping model rebuilt from the short-term user features has explanatory power and can be used to predict user value (that is, to predict whether a user belongs to a particular value user group, such as the high-value user group) from the user's short-term features, which represent short-term user behavior. Typical questions are whether the number of times a user logs in to a web page each week is related to the user's lifetime value, or whether a user who stays on the home page for more than 5 seconds at every login is more willing to spend.

Through feature engineering, user behavior is analyzed from the user data (including short-term and long-term user data) to extract important and meaningful user features (including short-term, long-term, and non-time-related user features). For example, user data may be divided into short-term and long-term user data on a daily, weekly, monthly, quarterly, semi-annual, or annual basis. The relevant length of time differs between business scenarios: for e-commerce retail, a week may count as long-term and an hour as short-term, whereas for the financial industry three years may count as long-term and a week as short-term. In short, what "short-term" and "long-term" mean for user data depends on the business scenario. Non-time-related user data is data that does not depend on time, such as the address of the user's registered residence, and non-time-related user features are likewise features that do not depend on time.

In the present utility model, user data is first used to group users by value, assigning each user to one of a plurality of value user groups. With a grouping or classification model such as an RFM model (Recency, Frequency, Monetary model, a classification model that groups users by values such as time since the most recent purchase, purchase frequency, and purchase amount), users may be grouped into, for example, a high-value (VIP) user group, a high-spending user group, a frequent-purchase user group, a recently highly active user group, or other value user groups. Next, a first grouping model is built for one value user group of interest (for example, the high-value user group). Feature engineering is then performed: the first grouping model is used to find the short-term user features that are important and meaningful for this value user group, and these features are used to build a second grouping model that relates only to short-term user behavior. Finally, based on the second grouping model, the user's medium- and long-term customer value score is predicted from the current user's short-term features; that is, the second grouping model predicts from the current user's short-term features whether the user belongs to the value user group of the second grouping model.
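As a rough illustration of RFM-style grouping, the following Python sketch scores each user on recency, frequency, and monetary value by rank and maps the scores to group names. The source does not specify how scores are computed or how groups are assigned, so the rank-based binning, the function names, the three-bin default, and the group rules in `value_group` are all assumptions.

```python
from datetime import date


def rank_scores(values, higher_is_better, bins):
    """Score users 1..bins by rank; the best-ranked slice receives `bins`."""
    # Sort from worst to best, then cut the ranking into `bins` equal slices.
    ordered = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(ordered)
    return {uid: 1 + (i * bins) // n for i, uid in enumerate(ordered)}


def rfm_scores(purchases, today, bins=3):
    """purchases maps user ID -> non-empty list of (purchase_date, amount)."""
    recency = {u: min((today - d).days for d, _ in recs) for u, recs in purchases.items()}
    frequency = {u: len(recs) for u, recs in purchases.items()}
    monetary = {u: sum(a for _, a in recs) for u, recs in purchases.items()}
    r = rank_scores(recency, higher_is_better=False, bins=bins)  # fewer days is better
    f = rank_scores(frequency, higher_is_better=True, bins=bins)
    m = rank_scores(monetary, higher_is_better=True, bins=bins)
    return {u: (r[u], f[u], m[u]) for u in purchases}


def value_group(score, bins=3):
    """Hypothetical mapping from an (R, F, M) score to a value user group name."""
    r, f, m = score
    if r == f == m == bins:
        return "high-value (VIP)"
    if m == bins:
        return "high-spending"
    if f == bins:
        return "frequent-purchase"
    if r == bins:
        return "recently highly active"
    return "other"


# Hypothetical purchase history for three users.
purchases = {
    "u1": [(date(2021, 8, 30), 500), (date(2021, 8, 15), 700), (date(2021, 7, 1), 300)],
    "u2": [(date(2021, 6, 1), 200)],
    "u3": [(date(2021, 8, 1), 400), (date(2021, 7, 20), 100)],
}
scores = rfm_scores(purchases, today=date(2021, 9, 7))
groups = {u: value_group(s) for u, s in scores.items()}
```

The resulting group labels play the role of the grouping result from which the first grouping model would then be built.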

The prediction described above, in which the second grouping model determines from the current user's short-term features whether the user belongs to the model's value user group, is carried out through feature comparison. The second grouping model may be a grouping model obtained with a classification or regression algorithm, or a neural network trained with another machine learning algorithm. Alternatively, the feature comparison may compute the correlation between the values of the user's short-term features and the values of the short-term user features of the second grouping model; for example, computing the cosine similarity makes it possible to predict whether the user belongs to the value user group of the second grouping model.
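A minimal Python sketch of the cosine-similarity variant follows. The source names cosine similarity but gives no details, so representing the value user group by the centroid of its members' feature vectors, the 0.9 decision threshold, and the sample vectors are assumptions; the sketch also assumes the features are already on comparable scales.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Component-wise mean of a non-empty list of feature vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def belongs_to_value_group(user_vector, group_vectors, threshold=0.9):
    """Assumed decision rule: similar enough to the group's centroid."""
    return cosine_similarity(user_vector, centroid(group_vectors)) >= threshold


# Hypothetical short-term features: weekly logins, weekly purchases, cart additions.
value_group_vectors = [[5, 3, 1], [4, 4, 1]]
similar_user = [9, 7, 2]
dissimilar_user = [0, 1, 10]
```

Because cosine similarity ignores vector length, `similar_user` matches the group profile even though its raw counts are twice as large.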

In short, the defined RFM model, the first grouping model, and the second grouping model can all be used to score a user's value. The RFM model and the first grouping model group users using long-term user data and long-term user features. Under intense business competition, these two models may have to wait until long-term user data is available before effective, precisely targeted marketing is possible, so valuable users may be lost in the meantime. The present utility model can therefore use the second grouping model to score a user's value early, based on short-term user data (that is, on short-term user features), so as to market to the user effectively and precisely, increase the user's spending, and effectively avoid losing valuable users.

User features that are important and meaningful for a value user group may be those whose importance ranks among the top few or exceeds a certain threshold. For example, purchase amount is an important and meaningful user feature for the high-value user group, whereas whether the user pays by cash or by credit card may be unimportant and meaningless for that group. When a grouping model is built, various user features can be defined so that, when the model is evaluated, it is possible to tell whether the model has explanatory power (that is, whether the grouping is sufficiently accurate). If the explanatory power is insufficient, more user features can be added. In a typical grouping model, however, some user features are unimportant and meaningless for the model's value user group.
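The top-ranked-or-above-threshold selection can be sketched in a few lines of Python. The importance scores below are made-up numbers standing in for whatever the first grouping model reports; the function name `select_features`, its parameters, and the optional `allowed` filter (used here to keep only short-term features) are assumptions.

```python
def select_features(importance, top_k=None, threshold=None, allowed=None):
    """Pick feature names by importance rank and/or by exceeding a threshold.

    importance: dict mapping feature name -> importance score.
    allowed: optional set of names to restrict the selection to,
             for example the short-term user features.
    """
    if top_k is None and threshold is None:
        raise ValueError("give top_k, threshold, or both")
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    if allowed is not None:
        ranked = [(name, score) for name, score in ranked if name in allowed]
    if threshold is not None:
        ranked = [(name, score) for name, score in ranked if score > threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [name for name, _ in ranked]


# Hypothetical importance scores from a first grouping model.
importance = {
    "monthly_spend": 0.40,        # long-term feature
    "weekly_logins": 0.25,        # short-term feature
    "cart_adds_per_week": 0.20,   # short-term feature
    "home_address": 0.13,         # non-time-related feature
    "payment_method": 0.02,
}
short_term = {"weekly_logins", "cart_adds_per_week"}
top_three = select_features(importance, top_k=3)
second_model_features = select_features(importance, threshold=0.1, allowed=short_term)
```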

Using a grouping model to make a prediction for a user means scoring the user's value, that is, performing feature comparison to predict whether the user belongs to the model's value user group. Feature comparison may refer to computing the similarity between the values of the user's features and the values of the features of the model's value user group, or to the probability result derived from that similarity. There are many ways to perform feature comparison. The present utility model is not limited to a probability computed by a neural network or to computing cosine similarity; feature comparison implemented with a support vector machine or a decision tree is also within its scope.

In the present utility model, "short-term" means shorter than the period about which an answer is sought. For example, to find out which users will have contributed the most by the end of the month, predictions may be made from one, two, or three weeks of user features; one, two, or three weeks is then short-term relative to one month. In other words, any period shorter than the target period can be called short-term, and the target period is called long-term. Note that some features do not depend on time; these are usually neither long-term nor short-term user features but non-time-related user features. For example, if the address of the user's registered residence is used as a user feature, it is neither a short-term nor a long-term user feature. Short-term user features are specifically features that carry a notion of relative time.
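The relative definition of short-term can be made concrete with a small Python sketch: features computed from the first few days of a target period are available before the period ends, whereas the period total is known only at the target time. The event format, the function names, and the seven-day window are assumptions chosen for illustration.

```python
from datetime import date, timedelta


def short_term_features(events, period_start, short_days):
    """Features from the first `short_days` days of the target period.

    events: list of (purchase_date, amount) for one user.
    """
    cutoff = period_start + timedelta(days=short_days)
    window = [(d, amount) for d, amount in events if period_start <= d < cutoff]
    return {
        "purchase_count": len(window),
        "spend": sum(amount for _, amount in window),
    }


def long_term_spend(events, period_start, period_end):
    """Total spend over the whole target period, known only once it has ended."""
    return sum(amount for d, amount in events if period_start <= d <= period_end)


# Hypothetical purchases by one user in September 2021.
events = [
    (date(2021, 9, 2), 300),
    (date(2021, 9, 6), 200),
    (date(2021, 9, 20), 900),
]
first_week = short_term_features(events, date(2021, 9, 1), short_days=7)
month_total = long_term_spend(events, date(2021, 9, 1), date(2021, 9, 30))
```

Here `first_week` is what a second grouping model could see on 8 September, while `month_total` is the kind of long-term value that becomes available only at month end.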

In the present utility model, user data is usually raw data rather than user features, unless the database was designed from the start to store user features as user data; for example, a database may be designed to compute each user's number of store visits and average weekly spending and to store them as user data. A feature is representative data and can be understood as an input parameter or variable. Meaningful input parameters or variables affect the final user value score. For a user's browsing behavior, for instance, frequently clicking on advertisements after entering a web page, liking to watch videos, or staying on a web page for no more than 10 seconds may all be meaningful features.

Common features can be divided by type into numerical, categorical, and temporal features. Features of these three types can be converted into one another; for example, a numerical feature can be processed and converted into a temporal feature. A numerical feature uses descriptive statistics to divide the overall range of the data into several bins. A categorical feature indicates which category something belongs to; for example, device information may be laptop, mobile phone, or tablet, and which of these the device is constitutes a categorical feature. A numerical feature can also be converted into a categorical feature, and a categorical feature can be converted into a numerical feature by weighting it according to the proportion of each category. A temporal feature may be obtained by computing the frequency, count, or magnitude of a numerical feature on a weekly, monthly, or quarterly basis; there are many ways to compute it, and the approaches described here are not limiting. In short, a feature of any one of the three types can be processed to generate a feature of another type.
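The conversions between feature types can be sketched in Python as follows. None of these functions come from the source: the cut points, the labels, the use of category proportions as the numerical weight, and the ISO-week grouping are assumptions that show one possible reading of each conversion.

```python
import bisect
from collections import Counter
from datetime import date


def to_category(value, cut_points, labels):
    """Numerical -> categorical: place a value into a labeled bin.

    cut_points must be sorted; len(labels) == len(cut_points) + 1.
    """
    return labels[bisect.bisect_right(cut_points, value)]


def proportion_encode(categories):
    """Categorical -> numerical: replace each category by its share of the data."""
    counts = Counter(categories)
    total = len(categories)
    return [counts[c] / total for c in categories]


def weekly_counts(dates):
    """Temporal feature: number of events per (ISO year, ISO week)."""
    return Counter((d.isocalendar()[0], d.isocalendar()[1]) for d in dates)


# Hypothetical data.
spend_label = to_category(999, cut_points=[100, 1000], labels=["low", "mid", "high"])
device_weights = proportion_encode(["phone", "phone", "laptop", "tablet"])
visits_per_week = weekly_counts([date(2021, 9, 1), date(2021, 9, 3), date(2021, 9, 6)])
```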

The purpose of using short-term user features is to predict long-term results from feature parameters gathered over a short period. Although short-term user features are time-related, feature types can be converted into one another, so in a broad sense short-term user features may include numerical, categorical, and temporal features, provided they are extracted over a period (short-term) that is shorter than the target time (long-term). Short-term is a relative concept. For example, a user's final spending for a month is normally known only at the end of that month, but the user's weekly consumption records, purchase frequency and cycle, number of store visits, browsing frequency, number and frequency of clicks, whether items have been added to a shopping cart or wish list, and past history offer a good chance of predicting the user's purchases by the end of the month. Once the important short-term user features have been identified, an automated process can be built around them; in actual use, the short-term user features are converted and extracted from the user data automatically.

In the present utility model, an extended feature is a feature that did not originally exist: it is computed, converted, or extracted from existing features or existing data. Data that is too fine-grained may be hard to use directly as a feature, so it can be converted to produce extended features for model training and processing. Address data, for example, may contain a lot of information. Take the address of the Presidential Office given as "No. 1, Section 2, Xinsheng South Road, Daan District, Taipei City 106". If this data is used directly for modeling, it is too detailed to be usable. Feature conversion is therefore applied, turning the original "Address: No. 1, Section 2, Xinsheng South Road, Daan District, Taipei City 106" into "Postal code: 106; City: Taipei City; District: Daan District; Other address: No. 1, Section 2, Xinsheng South Road", or into coarser positioning information obtained through a latitude/longitude lookup, such as "Latitude/longitude: 25N/121E". As another example, an existing feature "logins per 28 days = 10" can be extended into "average logins per week = 2" or "average logins per two weeks = 4". As a further example, from existing or raw data such as "ID: 0001; 2020-9-1 10:00:05; log in to home page; device=iphone12", multiple extended features related to user identifier, login time, login count, and device used can be derived. The login-time feature may include several extended features computed over different time slots and periods, and the device feature may include extended features related to device brand and device model.
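Deriving extended features from a raw log record can be sketched in Python as below. The record string follows the example in the text, but the exact field layout, the ASCII separators, the 06:00 to 18:00 definition of daytime, and the regular expression that splits "iphone12" into a device family and model are assumptions made for illustration.

```python
import re
from datetime import datetime


def parse_record(line):
    """Split an assumed 'ID: ...; timestamp; action; key=value' record into fields."""
    parts = [part.strip() for part in line.split(";")]
    record = {
        "user_id": parts[0].split(":", 1)[1].strip(),
        "timestamp": datetime.strptime(parts[1], "%Y-%m-%d %H:%M:%S"),
        "action": parts[2],
    }
    # Any remaining 'key=value' parts become extra fields, e.g. device.
    for part in parts[3:]:
        if "=" in part:
            key, value = part.split("=", 1)
            record[key.strip()] = value.strip()
    return record


def extend_features(record):
    """Derive extended features from one parsed record."""
    ts = record["timestamp"]
    features = {
        "user_id": record["user_id"],
        "login_hour": ts.hour,
        "login_weekday": ts.weekday(),   # Monday is 0
        "is_daytime": 6 <= ts.hour < 18,  # assumed definition of daytime
    }
    match = re.fullmatch(r"([a-z]+)(\d+)", record.get("device", ""))
    if match:
        features["device_family"] = match.group(1)
        features["device_model"] = match.group(2)
    return features


raw = "ID: 0001; 2020-9-1 10:00:05; log in to home page; device=iphone12"
features = extend_features(parse_record(raw))
```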

As another illustration of extended features, suppose that in a numerical regression application the existing features of a model that evaluates chip performance are resistance R and current I. If a person skilled in the relevant field suggests that power P is also an important and meaningful feature, a power feature P = I²R can be derived as a feature of the model. As a further example, in online marketing the total number of impressions I and the total number of clicks TC are existing features, from which the click-through rate CTR = TC/I can be derived as an extended feature for evaluating the effectiveness of advertisements, keywords, and so on. As yet another example, suppose a user's data includes "ID: 0001; 2020-9-1 10:00:05; log in to home page", "ID: 0001; 2020-9-1 10:00:15; leave home page", "ID: 0001; 2020-9-5 14:08:30; log in to home page", "ID: 0001; 2020-9-5 14:08:30; purchase; amount=1,000", and "ID: 0001; 2020-9-5 14:08:50; leave home page". By analyzing the visits recorded in this raw data, that is, the times of logging in to and leaving the home page, extended features such as "number of home page logins within seven days", "average daily home page logins", "home page logins in September", "home page logins in the first week of September", "daytime visit frequency", "dwell time per login", and "session length" can be derived. In short, the present utility model extends features in a similar way: from existing user data or predefined user features, it derives further features that are important and meaningful for the grouping model.
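The derived features in these examples can be computed with a short Python sketch. The formulas P = I²R and CTR = TC/I come from the text; the event tuples mirror the sample log, while the function names, the pairing of each login with the next leave event, and the seven-day window ending on 7 September 2020 are assumptions.

```python
from datetime import datetime, timedelta


def power(current, resistance):
    """Extended feature P = I^2 * R from current I and resistance R."""
    return current ** 2 * resistance


def click_through_rate(total_clicks, total_impressions):
    """Extended feature CTR = TC / I; returns 0.0 when there are no impressions."""
    return total_clicks / total_impressions if total_impressions else 0.0


def session_lengths(events):
    """Seconds between each login and the next leave event."""
    lengths = []
    start = None
    for timestamp, action in sorted(events, key=lambda event: event[0]):
        if action == "login":
            start = timestamp
        elif action == "leave" and start is not None:
            lengths.append((timestamp - start).total_seconds())
            start = None
    return lengths


def logins_within(events, end, days=7):
    """Number of logins in the `days` days up to and including `end`."""
    begin = end - timedelta(days=days)
    return sum(1 for ts, action in events if action == "login" and begin < ts <= end)


# Events for user 0001, following the sample log in the text.
events = [
    (datetime(2020, 9, 1, 10, 0, 5), "login"),
    (datetime(2020, 9, 1, 10, 0, 15), "leave"),
    (datetime(2020, 9, 5, 14, 8, 30), "login"),
    (datetime(2020, 9, 5, 14, 8, 30), "purchase"),
    (datetime(2020, 9, 5, 14, 8, 50), "leave"),
]
dwell_times = session_lengths(events)
recent_logins = logins_within(events, end=datetime(2020, 9, 7))
```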

Please refer to FIG. 1, a block diagram of the value user grouping system according to an embodiment of the present utility model. The value user grouping system 1 is used together with a user service system (not shown). The user service system provides services to users, and the value user grouping system 1 groups users according to the user data of the served users, predicting whether a user belongs to a particular value user group so that a marketing strategy can be decided and both transactions and user satisfaction can be increased. The services may be, for example, online or in-person shopping, transactions, investment, subscription, or consultation; that is, the user service system may include an online service host or a point-of-sale (POS) terminal in a physical store, and the present utility model is not limited in this respect. In other words, the value user grouping system 1 can group users of either online or physical services by value.

The value user grouping system 1 includes a platform device 11, a user management system 12, a public data source device 13, a database 14, and a prediction device 15 that predicts whether a user belongs to a value user group based on short-term user features. The database 14 is signal-connected to the platform device 11, the user management system 12, the public data source device 13, and the prediction device 15. In the present utility model, "signal connection" refers to a wireless or wired connection, implemented in software or hardware, that allows signals or information to be passed among the signal-connected components.

The platform device 11 is an enterprise platform or another type of platform, and is used to obtain user data comprising the user's browsing behavior 111, interests and preferences 112, device information 113, and Internet Protocol (IP) address or device identification 114. The user management system 12 is used to obtain user data comprising the user's demographic data 121, purchase records 122, billing records 123, and interaction records 124. The public data source device 13 is used to obtain user data comprising climate and geography 131, finance and economy 132, humanities and population 133, and medical and biological data 134. The user data obtained by the platform device 11, the user management system 12, and the public data source device 13 can be stored in the database 14. The database 14 may be an unstructured database, but the present utility model is not limited thereto, and the database 14 may also be a structured database.

The prediction device 15 is usually implemented as software running on a computer device. For example, the computer device includes a computing unit, a storage unit, and a communication unit, where the computing unit is electrically connected to the storage unit and the communication unit. The computing unit reads program code stored in a non-volatile storage medium. When executed, the program code carries out a method of predicting a user's value score from the user's short-term features (that is, predicting from the user's short-term features whether the user belongs to a certain value user group). The prediction device 15 can also be implemented as a hardware circuit, for example by writing Verilog or VHDL code and programming it into a field-programmable gate array (FPGA) chip, or by fabricating an application-specific integrated circuit (ASIC), so as to realize the prediction device 15 purely in hardware. In short, whether the prediction device 15 is implemented in software or in hardware does not limit the present utility model.

Regardless of whether the prediction device 15 is implemented in software or in hardware, it can be divided into multiple modules as shown in FIG. 1. In FIG. 1, the prediction device 15 includes a data processing module 151, a machine learning module 152, a model deployment module 153, and a marketing module 154. These four modules are signal-connected to one another and to the database 14. The module division of FIG. 1 is only one way of realizing the present utility model and is not intended to limit it. The data processing module 151, the machine learning module 152, the model deployment module 153, and the marketing module 154 are described below together with the prediction method of FIG. 3, which groups users into value user groups based on short-term user features, so they are not detailed here.

Before describing the details of the prediction device 15 and the prediction method of FIG. 3, please refer to FIG. 2, which is a conceptual diagram of grouping users into value user groups based on short-term user features according to an embodiment of the present utility model. In FIG. 2, the user data set 12S, consisting of the demographic data 121, purchase records 122, billing records 123, and interaction records 124, is used to group multiple users into value user groups. Each user is assigned to one of the value user groups 231, 232, ..., 23N, where N is a positive integer greater than or equal to 2. The grouping may use, for example, the RFM model, or another approach such as grouping by static features (age, gender, occupation, registration time) or by dynamic features (RFM parameters, membership level or promotion status, account balance, the pickup or delivery method chosen for goods), and the present utility model is not limited thereto.

In other words, on the right side of FIG. 2, the RFM model can first be used to group multiple users into value user groups. The RFM model can perform this grouping only on the basis of the user data in the user data set 12S. Most of the user data in the user data set 12S is long-term user data, so the RFM model needs at least long-term user data in order to group value users. The user data set 12S may also contain short-term user data and non-time-related user data.

As stated above, the purpose of the present utility model is to use short-term user features, whose statistical period is shorter than that of long-term user features, to learn in advance whether each user in the user group 22 is likely to belong to one of the value user groups 231 to 23N. To this end, a short-term user feature set 21 comprising multiple short-term user features 211 to 21M can be generated from the browsing behavior 111, interests and preferences 112, device information 113, and IP address or device identification 114 in the user data set 11S, and from the climate and geography 131, finance and economy 132, humanities and population 133, and medical and biological data 134 in the user data set 13S.

To make it possible to use at least some of the short-term user features 211 to 21M in the short-term user feature set 21 to predict whether a user belongs to one of the value user groups 231 to 23N, the present utility model builds a corresponding first grouping model for each value user group of interest (possibly all of them), performs feature engineering on the first grouping model to identify short-term user features, and builds a second grouping model from them. The second grouping model is then used to predict, from a user's short-term features, whether the user belongs to one of the value user groups 231 to 23N. For example, for the value user group 231, the user features that are important and meaningful to that group are identified (including long-term user features, short-term user features, and non-time-related user features). Feature engineering is then performed to pick out the short-term user features in the first grouping model, and these are used to build the second grouping model of the value user group 231. Afterwards, the second grouping model of the value user group 231 predicts from a user's short-term features whether the user belongs to the value user group 231.

Please refer next to FIG. 1 and FIG. 3. FIG. 3 is a flowchart of a prediction method that groups users into value user groups based on short-term user features according to an embodiment of the present utility model. First, in step S31, the data processing module 151 obtains a grouping result, which indicates the users included in at least one value user group. More specifically, the data processing module 151 can collect user data (including at least long-term user data) and classify users into the corresponding value user groups according to that data. For example, with the RFM model, users can be classified from their user data into a high-value (VIP) user group, a high-spending user group, a frequent-purchase user group, a recently highly active user group, or other value user groups. Alternatively, in step S31 the data processing module 151 may process the user data to obtain user features (including at least long-term user features) and apply a preset grouping model to those features to obtain the grouping result. In short, the way the data processing module 151 obtains the grouping result does not limit the present utility model. In other embodiments, the grouping need not even be performed by the data processing module 151, which may simply receive the grouping result.

Consider the membership system commonly used in e-commerce. Each user obtains a user account upon registration, and all of the user's subsequent activity in the user service system is recorded in the database 14: for example, when the user made a purchase, how much was spent, the time, place, and items of the purchase, which pages were clicked before the purchase, how many items were put in the shopping cart, and how many items were actually bought. After the data processing module 151 obtains this data and compiles statistics at the end of the month, it can obtain a grouping result using one of various grouping methods (for example, the RFM model mentioned above, or grouping by product preference or by campaign participation). Taking the RFM model as an example, it classifies users by total spending, purchase frequency, and most recent purchase into a high-value (VIP) user group, a high-spending user group, a frequent-purchase user group, a recently highly active user group, or other value user groups.
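As a rough illustration of the RFM grouping described above, the following Python sketch computes recency, frequency, and monetary values from a purchase list and maps them to the group names used in the text. The thresholds (`r_max`, `f_min`, `m_min`), the precedence of the rules, and the function names are all hypothetical; the source does not specify how the RFM values are cut into groups.

```python
from datetime import date


def rfm(purchases, today):
    """Return (recency_days, frequency, monetary) for one user.

    purchases: non-empty list of (purchase_date, amount) tuples.
    """
    last_purchase = max(d for d, _ in purchases)
    recency_days = (today - last_purchase).days
    frequency = len(purchases)
    monetary = sum(amount for _, amount in purchases)
    return recency_days, frequency, monetary


def segment(recency_days, frequency, monetary, r_max=7, f_min=4, m_min=3000):
    """Map RFM values to a group name. All thresholds are assumptions."""
    recent = recency_days <= r_max
    frequent = frequency >= f_min
    big_spender = monetary >= m_min
    if recent and frequent and big_spender:
        return "high-value (VIP)"
    if big_spender:
        return "high-spending"
    if frequent:
        return "frequent-purchase"
    if recent:
        return "recently highly active"
    return "other"


if __name__ == "__main__":
    history = [
        (date(2020, 9, 1), 1000),
        (date(2020, 9, 5), 1000),
        (date(2020, 9, 20), 500),
        (date(2020, 9, 28), 800),
    ]
    r, f, m = rfm(history, today=date(2020, 9, 30))
    print(r, f, m, segment(r, f, m))
```

In practice the cut-offs would more likely be derived from the distribution of the data (for example quantiles) than fixed by hand; the fixed values here only keep the sketch short.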

In step S32, the data processing module 151 and the machine learning module 152 build a first grouping model of the value user group. The grouping result shows which users belong to a given value user group. The data processing module 151 can process the user data of those users and, according to system-preset or manually defined rules, obtain or identify multiple user features from the user data (including at least long-term user features, and possibly also short-term user features and non-time-related user features). The machine learning module 152 then builds the first grouping model of the value user group from the user features obtained by the data processing module 151.

Optionally, because some user features may not be important or meaningful to the value user group, if the user features produced by the data processing module 151 are not already known to be related to the value user group, the machine learning module 152 can, in step S32, learn to select only those user features that are important and meaningful to the value user group. Machine learning algorithms can be used to perform feature engineering, extracting and confirming which user features are important and meaningful to a value user group (for example, the high-value user group). Typically, feature selection is performed: the top-ranked user features by importance score (or those whose importance exceeds a threshold) are selected as the important user features of the value user group. These important user features are then used to complete the first grouping model.
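The selection rule described above (keep the top-ranked features, or those whose importance exceeds a threshold) can be sketched as follows. The importance scores in `EXAMPLE_IMPORTANCE` are invented for illustration; in practice they would come from whatever learning algorithm the machine learning module uses, which the text leaves open.

```python
def select_features(importance, top_k=None, threshold=None):
    """Return feature names ordered by descending importance.

    importance: dict mapping feature name -> importance score.
    threshold:  if given, keep only features scoring strictly above it.
    top_k:      if given, keep at most this many of the best features.
    """
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(name, score) for name, score in ranked if score > threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [name for name, _ in ranked]


# Hypothetical scores, not taken from the source document.
EXAMPLE_IMPORTANCE = {
    "weekly_logins": 0.35,
    "avg_stay": 0.30,
    "weekly_spend": 0.25,
    "weekly_home_stays": 0.06,
    "device": 0.04,
}

if __name__ == "__main__":
    print(select_features(EXAMPLE_IMPORTANCE, top_k=3))
    print(select_features(EXAMPLE_IMPORTANCE, threshold=0.05))
```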

For example, once it is known which users belong to the high-value user group and which do not, the user features that are meaningful and important to the high-value user group can be extracted, and the same can be done for the other value user groups. The extracted user features might be, for example, that users in the high-value user group browse the website for more than 10 seconds each day, that they visit the home page and stay for at least 5 seconds each week, or that they habitually log in from a mobile phone rather than a personal computer. Knowing that these user features are important, the data processing module 151 processes the user data to extract them, and the machine learning module 152 builds the first grouping model of the high-value user group from them.

A user's data is usually raw data that must be processed and transformed by the data processing module 151. In this example, the user data of one user in the high-value user group includes "ID: 0001; 2020-9-1 10:00:05; log in to home page; device=iphone12", "ID: 0001; 2020-9-1 10:00:15; leave home page; device=iphone12", "ID: 0001; 2020-9-5 14:08:30; log in to home page; device=iphone12", "ID: 0001; 2020-9-5 14:08:30; purchase; purchase amount=1,000; device=iphone12" and "ID: 0001; 2020-9-5 14:08:50; leave home page; device=iphone12".

From this data, the data processing module 151 can automatically extract the features of user "ID: 0001" as "weekly login frequency=2", "average dwell time=15", "weekly home page visits=2", "weekly purchase amount=1,000" and "device=iphone12". In this example, "device=iphone12" can be regarded as a non-time-related user feature. If statistics are compiled only at the end of the month, then "weekly login frequency=2", "average dwell time=15", "weekly home page visits=2" and "weekly purchase amount=1,000" are all short-term user features. Note that although no long-term user features are listed here, the data processing module 151 can in fact also obtain long-term user features, such as "monthly login frequency=9", "average dwell time=12", "monthly home page visits=9" and "monthly purchase amount=3,000".
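The extraction of weekly features from the raw log above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the tuple layout of `RAW`, the zero-padded timestamp format, grouping by ISO week, and measuring dwell time in seconds from each login to the next leave event are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Raw events from the example: (user id, timestamp, event, amount, device).
RAW = [
    ("0001", "2020-09-01 10:00:05", "login", 0, "iphone12"),
    ("0001", "2020-09-01 10:00:15", "leave", 0, "iphone12"),
    ("0001", "2020-09-05 14:08:30", "login", 0, "iphone12"),
    ("0001", "2020-09-05 14:08:30", "purchase", 1000, "iphone12"),
    ("0001", "2020-09-05 14:08:50", "leave", 0, "iphone12"),
]


def weekly_features(rows):
    """Aggregate raw events into per-user, per-ISO-week features."""
    acc = defaultdict(lambda: {"logins": 0, "spend": 0, "stays": []})
    open_login = {}  # user id -> time of the login not yet closed by a leave
    for uid, ts, event, amount, _device in rows:
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        iso = t.isocalendar()
        key = (uid, iso[0], iso[1])  # (user, ISO year, ISO week)
        if event == "login":
            acc[key]["logins"] += 1
            open_login[uid] = t
        elif event == "leave" and uid in open_login:
            stay = (t - open_login.pop(uid)).total_seconds()
            acc[key]["stays"].append(stay)
        elif event == "purchase":
            acc[key]["spend"] += amount
    return {
        key: {
            "weekly_logins": v["logins"],
            "avg_stay_s": sum(v["stays"]) / len(v["stays"]) if v["stays"] else 0.0,
            "weekly_spend": v["spend"],
        }
        for key, v in acc.items()
    }


if __name__ == "__main__":
    for key, feats in weekly_features(RAW).items():
        print(key, feats)
```

With the five example events, both visits fall in the same ISO week, so the sketch yields two logins, an average dwell time of 15 seconds (10 s and 20 s), and a weekly purchase amount of 1,000, matching the features listed in the text.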

In step S33, based on the first grouping model, the machine learning module 152 performs feature engineering to extract short-term user features and builds a second grouping model from the extracted short-term user features. Note that the second grouping model may also take non-time-related user features as input parameters. That is, the machine learning module 152 can additionally extract non-time-related user features and build the second grouping model from both the extracted short-term user features and the non-time-related user features. If the user features of the first grouping model include long-term, short-term, and non-time-related user features, the short-term user features in the first grouping model can be selected according to importance (optionally together with important and meaningful non-time-related user features), and the second grouping model can then be built from the selected user features. The feature engineering may be implemented with machine learning algorithms, and the present utility model is not limited thereto.

In some cases the second grouping model may lack sufficient explanatory power, because the first grouping model may contain too few short-term user features, or none at all. Therefore, in step S33 the data processing module 151 can further process the user data and derive additional short-term user features from the user features of the first grouping model, so that the resulting second grouping model has more explanatory power. For example, if one of the long-term user features in the first grouping model is "monthly login frequency>9", the short-term user features "weekly login frequency>2" or "biweekly login frequency>4" can be derived from it. One way to carry out this feature extension is to draw a radar chart and derive short-term user features for the types of user feature in which the first grouping model is weak, so that the second grouping model has sufficient explanatory power.
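One simple way to reproduce the thresholds in the example above is to divide the monthly threshold by the number of short periods in a month and round down. The sketch below does exactly that; treating a month as four weeks (or two biweekly periods) and rounding down are assumptions that happen to match the numbers in the text, not a rule stated by the source.

```python
import math


def extend_threshold(monthly_threshold: float, periods_per_month: int) -> int:
    """Scale a monthly 'greater than' threshold down to a shorter period.

    Assumes equal-length periods within the month and rounds down.
    """
    return math.floor(monthly_threshold / periods_per_month)


if __name__ == "__main__":
    # "monthly login frequency > 9" -> weekly and biweekly thresholds
    print(extend_threshold(9, 4))  # weekly
    print(extend_threshold(9, 2))  # biweekly
```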

For example, if the important and meaningful short-term user features of the first grouping model of the high-value user group are found to be "weekly login frequency>2", "average dwell time>10" and "weekly purchase amount>1,000", these short-term user features are used to build and train the second grouping model. If the first grouping model of the high-value user group also contains unimportant and meaningless user features such as "weekly home page visits=2" and "device=iphone12", those features are not used to train and build the second grouping model, so as to avoid overfitting.

The data processing module 151 can determine the short-term period of a short-term feature according to a short-term period rule, and the short-term period serves as the observation period for the short-term feature value. The data processing module 151 can also adjust the short-term period according to a short-term period rule. For example, it can analyze how the feature values are distributed along the time axis and use that as the basis for the adjustment, or it can perform time series analysis on the long-term feature data set, for instance to identify seasonality, and adopt a corresponding short-term period rule. A short-term period may be a calendar period such as an hour, day, week, month, or quarter, or a variation of a calendar period such as two days or two weeks.

Short-term user features need not be divided into fixed periods. If user behavior such as browsing, visiting, or purchasing is non-linear, non-fixed periods can be used. For example, suppose a user is more active on the platform device 11 at the beginning of each month and the activity declines toward the end of the month. If features are divided into fixed periods, feature values near the end of the month may make the prediction inaccurate. Suppose a user habitually spends an amount V₁ > 10,000 in the first week of each month and spends less as the month goes on, for example V₂ = 2,000, V₃ = 0, and V₄ = 0 in the second to fourth weeks. If the monthly purchase amount is the long-term feature and the short-term feature uses a fixed period, the purchase amounts of the last one or two weeks may cause this user to be predicted as a negative example. To solve this problem, the short-term period can be set to increase, decrease, or otherwise vary. For instance, the periods need not be contiguous: a specific time point can be used as an anchor, and the period observed is a window starting from that anchor. With the first day of each month as the anchor and a window of seven days, the first week can be set as a first period and the first to fourth weeks as a second period, or the remaining weeks can even be ignored so that only the first week of each month is used as the period of the short-term feature. Alternatively, the window can be set to Monday through Wednesday of each week, ignoring Thursday through Sunday.

The period may also grow by one day with each successive period, as in the sequence P₁ = 1 day, P₂ = 2 days, P₃ = 3 days, P₄ = 4 days, ..., Pₙ = n days. The period of a short-term feature may also increase in steps: for example, within one month after membership registration the short-term period is one week, and from two to six months after registration it is two weeks. Periods may also overlap, for example by sliding a window: with a window of 3 days, September 1 to September 3 is one short-term period, September 2 to September 4 is another, September 3 to September 5 is another, and so on. As the above shows, short-term features are defined relative to long-term features, so long-term features may likewise be non-fixed or vary along with the short-term features. Taking the stepwise increase as an example, within one month after registration the long-term feature may have a period of one month, while from two to six months after registration the long-term feature may have a period of six months.
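The three kinds of short-term period described above (a window anchored at a fixed date, overlapping sliding windows, and periods that grow by one day each time) can be sketched with the standard `datetime` module. The function names and the representation of a period as an inclusive `(start, end)` date pair are choices made for this illustration only.

```python
from datetime import date, timedelta


def anchored_window(anchor: date, days: int):
    """One window of `days` days starting at the anchor (inclusive bounds)."""
    return (anchor, anchor + timedelta(days=days - 1))


def sliding_windows(start: date, end: date, days: int):
    """Overlapping windows of `days` days, shifted by one day each time."""
    windows = []
    current = start
    while current + timedelta(days=days - 1) <= end:
        windows.append((current, current + timedelta(days=days - 1)))
        current += timedelta(days=1)
    return windows


def increasing_periods(start: date, count: int):
    """Consecutive periods of 1, 2, ..., count days (P_n = n days)."""
    periods = []
    current = start
    for length in range(1, count + 1):
        periods.append((current, current + timedelta(days=length - 1)))
        current += timedelta(days=length)
    return periods


if __name__ == "__main__":
    print(anchored_window(date(2020, 9, 1), 7))
    for window in sliding_windows(date(2020, 9, 1), date(2020, 9, 5), 3):
        print(window)
    for period in increasing_periods(date(2020, 9, 1), 3):
        print(period)
```

Applied to September 1 through September 5 with a 3-day window, `sliding_windows` produces the same three periods as the example in the text.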

Next, once the second grouping model has been built, in step S34 the model deployment module 153 deploys the second grouping model and uses it to predict, from a user's short-term features, whether the user belongs to the value user group of the second grouping model. In short, after the second grouping model is deployed, whenever a user logs in the prediction device 15 can determine whether the user belongs to the value user group of the second grouping model without waiting for the target time (for example, the long-term period of a long-term feature) to arrive. Incidentally, the prediction of whether a user belongs to the value user group of the second grouping model may be made by feature comparison, for example by classification with a neural network or by computing cosine similarity.
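Of the two comparison methods mentioned above, cosine similarity is the simpler to illustrate. The sketch below compares a user's short-term feature vector with a reference vector for the value user group. The reference vector, the feature order, and the idea of comparing against a single group profile are hypothetical; the source only names cosine similarity as one possible comparison.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors.

    Returns 0.0 if either vector has zero length, as a convention.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Feature order (assumed): weekly logins, average dwell time, weekly spend.
    group_profile = [3.0, 20.0, 1500.0]   # hypothetical value-group profile
    user = [3.0, 25.0, 1300.0]            # user "ID: 0004" from the later example
    print(cosine_similarity(user, group_profile))
```

Because the raw features have very different scales (the purchase amount dominates), a real system would normally standardize each feature before computing the similarity.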

After predicting from a user's short-term features whether the user belongs to the value user group, in step S35 the marketing module 154 determines the marketing strategy for the user according to the prediction of whether the user belongs to the value user group of the second grouping model. For example, if short-term features are extracted from one week of a user's data and the user is predicted from them to belong to the high-value user group, discount coupons or promotions intended for near-high-value or high-value users can be sent to the user directly, thereby increasing the success rate of transactions or subscriptions. Incidentally, the prediction that a user belongs to a value user group can be expressed as a probability, for example 50%, 70%, or 90%, so the marketing module 154 can provide different offers to the user according to the probability value of the prediction.
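The idea of giving different offers according to the predicted probability can be sketched as a simple tier lookup. The cut-off values reuse the 50%, 70%, and 90% figures mentioned above purely for illustration, and the offer names are hypothetical; the source does not define any specific mapping.

```python
def choose_offer(probability: float) -> str:
    """Map a predicted membership probability to an offer tier.

    Cut-offs and offer names are assumptions made for this sketch.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= 0.9:
        return "high-value coupon"
    if probability >= 0.7:
        return "near-high-value coupon"
    if probability >= 0.5:
        return "general promotion"
    return "no offer"


if __name__ == "__main__":
    for p in (0.95, 0.7, 0.55, 0.2):
        print(p, choose_offer(p))
```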

For example, suppose the short-term features of several users are "ID: 0001; weekly login frequency=2; average dwell time=15; weekly purchase amount=800", "ID: 0002; weekly login frequency=0.01; average dwell time=7; weekly purchase amount=600", "ID: 0003; weekly login frequency=1; average dwell time=50; weekly purchase amount=50,000", "ID: 0004; weekly login frequency=3; average dwell time=25; weekly purchase amount=1,300" and "ID: 0005; weekly login frequency=10; average dwell time=50; weekly purchase amount=15,300", and the short-term user features of the second grouping model are "weekly login frequency>2", "average dwell time>10" and "weekly purchase amount>1,000". Then the users "ID: 0004" and "ID: 0005" can be predicted to belong to the value user group of the second grouping model.
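The worked example above can be checked with a short script that treats the second grouping model as three "greater than" conditions that must all hold. Reducing the model to a conjunction of thresholds is a simplification made for this illustration; the dictionary keys are hypothetical names for the features in the text.

```python
# Thresholds of the second grouping model from the example (all strict '>').
RULES = {"weekly_logins": 2, "avg_stay": 10, "weekly_spend": 1000}

# Short-term features of the five users from the example.
USERS = {
    "0001": {"weekly_logins": 2, "avg_stay": 15, "weekly_spend": 800},
    "0002": {"weekly_logins": 0.01, "avg_stay": 7, "weekly_spend": 600},
    "0003": {"weekly_logins": 1, "avg_stay": 50, "weekly_spend": 50000},
    "0004": {"weekly_logins": 3, "avg_stay": 25, "weekly_spend": 1300},
    "0005": {"weekly_logins": 10, "avg_stay": 50, "weekly_spend": 15300},
}


def predict_members(users, rules):
    """Return the ids of users whose features exceed every threshold."""
    return [
        uid
        for uid, feats in users.items()
        if all(feats[name] > limit for name, limit in rules.items())
    ]


if __name__ == "__main__":
    print(predict_members(USERS, RULES))  # ['0004', '0005']
```

User 0003 spends far more than the threshold but logs in only once a week, so under this all-conditions reading the user is not predicted to be in the group, which agrees with the result stated in the text.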

Note that the second grouping model may be updated according to the period of its short-term characteristics, for example according to the shortest period, that is, the period of the characteristic having the shortest period among all characteristics of the second grouping model. It may also be updated at any time, in real time, or in near real time, for example once every hour or every two hours. For example, even when users spend more only because an event is being held (the event may be festive, one-off, temporary, sudden, short-term, and so on), the short-term user characteristics through which the event drives users to spend more may also be treated as short-term user characteristics of the second grouping model; that is, event-related short-term user characteristics supplement the original short-term user characteristics. Because the second grouping model is updated continually, some time after the event ends it no longer uses those event-driven characteristics as its short-term user characteristics; that is, the event-related short-term user characteristics are removed from the plurality of short-term user characteristics. An alternative is to prepare two second grouping models: one ignores event-related short-term user characteristics and serves as the second grouping model for normal use, while the other takes event-related short-term user characteristics into account and serves as the second grouping model used while an event is held. For example, if the event promotes a certain category of goods, the second grouping model used for the event may take the user's interests and the amount the user purchases each week during the event (assuming the event lasts one month) as its short-term user characteristics.
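The supplement-then-remove behavior around an event can be sketched as follows. The feature names, the `grace_days` parameter (standing in for "some time after the event ends"), and the function name are all hypothetical and chosen only for illustration.

```python
from datetime import date, timedelta

# Baseline short-term features of the second grouping model (names are assumptions).
BASE_FEATURES = frozenset({"weekly_login_frequency", "average_stay_time", "weekly_spend"})
# Hypothetical event-related short-term feature.
EVENT_FEATURES = frozenset({"weekly_event_purchase_amount"})


def active_short_term_features(today, event_start, event_end, grace_days=7):
    """Return the feature set the model should use on a given day.

    Event-related features are added while the event runs and for an assumed
    grace period afterwards, then dropped again.
    """
    last_active_day = event_end + timedelta(days=grace_days)
    if event_start <= today <= last_active_day:
        return BASE_FEATURES | EVENT_FEATURES
    return BASE_FEATURES


event_start, event_end = date(2021, 9, 1), date(2021, 9, 30)
during_event = active_short_term_features(date(2021, 9, 15), event_start, event_end)
after_event = active_short_term_features(date(2021, 10, 20), event_start, event_end)
print(sorted(during_event))
print(sorted(after_event))
```

The two-model alternative described in the text would instead keep two fixed feature sets and switch which model is used for prediction during the event.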

In addition, the sources and types of the user data are not limited to those described above. The raw user data may also include data sensed by Internet of Things devices or other devices and transmitted to the prediction device. For example, when a user parks at a store, the parking payment machine records the user's parking duration, number of parking visits, and so on, and the second grouping model may use the ratio of the user's weekly number of parking visits to the user's number of in-store purchases as a short-term user characteristic. Furthermore, the user data of a single user may come from more than one data source. For example, short-term user characteristics may be derived from user data collected through the POS systems of different branches of a physical chain store. As another example, a user who has been transferred no longer buys at the chain coffee shop near the original workplace and instead buys at the chain coffee shop near the new workplace; the user's short-term characteristics can be obtained correctly only after the user data from the two sources are consolidated. This also illustrates the combined use of non-temporal characteristics and short-term characteristics.
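A minimal sketch of consolidating two data sources and deriving the parking-to-purchase ratio is given below. The per-branch dictionaries, the sample numbers, and both function names are assumptions for illustration; real POS or IoT records would need identity matching and cleaning first.

```python
from collections import defaultdict


def merge_weekly_counts(*sources):
    """Sum per-user weekly counts coming from several data sources (e.g. branch POS systems)."""
    totals = defaultdict(int)
    for source in sources:
        for user_id, count in source.items():
            totals[user_id] += count
    return dict(totals)


def parking_to_purchase_ratio(parking_counts, purchase_counts):
    """Weekly parking visits divided by weekly in-store purchases, per user.

    Users with zero purchases are skipped to avoid division by zero.
    """
    return {
        user_id: parking_counts.get(user_id, 0) / purchases
        for user_id, purchases in purchase_counts.items()
        if purchases > 0
    }


# Illustrative data: user 0001 bought at two branches in the same week.
branch_a_purchases = {"0001": 3, "0002": 1}
branch_b_purchases = {"0001": 2}
weekly_parking = {"0001": 4}

merged_purchases = merge_weekly_counts(branch_a_purchases, branch_b_purchases)
ratios = parking_to_purchase_ratio(weekly_parking, merged_purchases)
print(merged_purchases)
print(ratios)
```

Without the merge step, user 0001 would appear to have made only three purchases at branch A, which would distort the ratio.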

Referring next to FIG. 4, FIG. 4 is a schematic diagram of the predicted RFM grouping, produced by the prediction method or device of this embodiment that groups users into value user groups based on short-term user characteristics, together with the current RFM grouping. Predicted RFM grouping refers to predicting, for each value customer group in the grouping result of the RFM model, the customers who match that group by means of the short-term characteristics described above. The high-value user group 411 in the current RFM grouping result 41, obtained by RFM grouping of the user data at the target time, and the high-value user group 421 in the RFM grouping result 42 predicted by the prediction method or device of this embodiment may contain different users. The users of the high-value user group 421 in the predicted RFM grouping result 42 can be regarded as high-value or quasi-high-value users and marketed to directly. In other words, the users of the high-value user group 421 can be added directly to the high-value user group 411 to optimize the current RFM grouping result 41 and identify as many potential high-value users as possible. In short, the prediction method of this embodiment further includes a step of merging the predicted value user group with the current value user group. In addition, changes in users between the high-value user groups 411 and 421 can also be used to assess whether the current marketing strategy is correct, as described below.
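The merging step can be pictured with plain set operations. The user IDs below are made-up sample data, and the variable names simply mirror reference numerals 411 and 421.

```python
# Illustrative user IDs only; 411 is the current group, 421 the predicted group.
current_high_value_411 = {"0003", "0005"}
predicted_high_value_421 = {"0004", "0005"}

# Merge the predicted group into the current group (set union).
merged_high_value = current_high_value_411 | predicted_high_value_421

# Users found only by the short-term prediction: candidates for early marketing.
quasi_high_value = predicted_high_value_421 - current_high_value_411

print(sorted(merged_high_value))
print(sorted(quasi_high_value))
```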

Referring to FIG. 5A and FIG. 5B together, FIG. 5A is a schematic diagram of the value user group obtained at a first time by the prediction method or device of this embodiment, together with the current value user group, and FIG. 5B is a schematic diagram of the value user group obtained at a second time, together with the current value user group. At the first time, the predicted value user group 52 and the current value user group 51 share an intersection 531. At the second time, however, the intersection 532 between the predicted value user group 52 and the current value user group 51 is smaller than the intersection 531, meaning that fewer users fall in the intersection. Statistical analysis can be performed on the users in the intersection and on the change in the intersection to understand how original customers (the users of the value user group 51 located in the intersection 531) may no longer match the value user group 51 or may have churned. This makes it possible to examine whether the marketing strategy aimed at the value user group 51 is mistaken or needs adjustment and has thereby caused users to leave the value user group, and thus to determine whether the marketing strategy needs to be adjusted.

Referring to FIG. 5A and FIG. 5C together, FIG. 5C is another schematic diagram of the value user group obtained at the second time by the prediction method or device of this embodiment, together with the current value user group. At the first time, the predicted value user group 52 and the current value user group 51 share an intersection 531. At the second time, however, the intersection 533 between the predicted value user group 52 and the current value user group 51 has become larger. It can therefore be inferred that the marketing strategy may be taking effect, so the original marketing strategy can be maintained. In some embodiments, the prediction device 15 may also include or be electrically connected to a display module (not shown), which can present the user groups graphically on a screen, in a manner similar to FIGS. 5A to 5C, based on the data of the user groups 51 and 52, so that changes in the predicted value user group 52 over time, or changes between the current value user group 51 and the predicted value user group 52, can be monitored. In short, the change over time in the intersection between the value user group predicted by the prediction device from short-term user characteristics and the current value user group can be monitored to judge whether the marketing strategy is appropriate. That is, the prediction method of this embodiment further includes the step described here of monitoring the change in the intersection to judge whether the marketing strategy is correct.
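The intersection monitoring of FIGS. 5A to 5C can be sketched as a comparison of intersection sizes at two points in time. The returned labels, the handling of an unchanged intersection (which the text does not address), and the sample user IDs are assumptions for this sketch.

```python
def intersection_trend(current_t1, predicted_t1, current_t2, predicted_t2):
    """Compare the current/predicted intersection at two times.

    Returns a label and the users who dropped out of the intersection,
    which could feed the statistical analysis mentioned in the text.
    """
    overlap_t1 = current_t1 & predicted_t1
    overlap_t2 = current_t2 & predicted_t2
    dropped_users = overlap_t1 - overlap_t2
    if len(overlap_t2) < len(overlap_t1):
        return "review strategy", dropped_users      # FIG. 5B case
    if len(overlap_t2) > len(overlap_t1):
        return "maintain strategy", dropped_users    # FIG. 5C case
    return "no change", dropped_users                # assumed handling


current_51 = {"A", "B", "C", "D"}
predicted_52_first = {"B", "C", "D", "E"}         # first time: overlap of 3
predicted_52_shrunk = {"D", "E", "F"}             # second time: overlap of 1
predicted_52_grown = {"A", "B", "C", "D", "E"}    # second time: overlap of 4

shrunk = intersection_trend(current_51, predicted_52_first, current_51, predicted_52_shrunk)
grown = intersection_trend(current_51, predicted_52_first, current_51, predicted_52_grown)
print(shrunk)
print(grown)
```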

In some embodiments, the present utility model may execute the modules or methods of this disclosure by means of a computer software program, which may be stored in a non-volatile storage medium such as a flash memory device, a magnetic storage device, an optical disc, or another type of storage medium. For example, the device of the present utility model may use a computer software program to fit a first user data set with a machine learning algorithm in order to select a user short-term feature set. The user short-term feature set may include at least one short-term feature. The first user data set may include a user long-term feature set having a first periodicity. The user long-term feature set may include long-term feature data. The computer software program may perform feature engineering on the user data set to select the user short-term feature set. The feature engineering may include feature importance analysis as the basis for selecting the user short-term feature set. The feature engineering may include short-term feature extraction, which extracts the user short-term feature set from the user long-term feature set. The user short-term feature set may include user short-term feature data having a second periodicity. The first periodicity may include a first period, and the second periodicity may include a second period; for the same feature, the second period may be shorter than the first period. The computer software program may process the first user data set according to the user short-term feature set to generate a second user data set. Processing the first user data set may include deleting from it the features outside the user short-term feature set. The computer software program may use the machine learning algorithm to fit the second user data set to generate a user grouping prediction model. The computer software program may use the user grouping prediction model to make predictions on a third user data set to generate a first prediction result. The third user data set may include the user data of at least one specific user. The third user data set may include the user short-term feature set. Besides the prediction of whether a specific user belongs to a specific group, the first prediction result may further include a fourth user data set. The fourth user data set may include the user data predicted as positive examples by the user grouping prediction model. The computer software program may perform union analysis on the first user data set and the fourth user data set to obtain a first union analysis result. The computer software program may merge the first user data set and the fourth user data set to obtain a fifth user data set. After generating the first prediction result, the computer software program may use the user grouping prediction model to make predictions on a sixth user data set to generate a second prediction result. The sixth user data set may include the user data of at least one specific user. The sixth user data set may include the user short-term feature set. Besides the prediction of whether a specific user belongs to a specific group, the second prediction result may further include a seventh user data set. The seventh user data set may include the user data predicted as positive examples by the user grouping prediction model. The computer software program may merge the first user data set and the seventh user data set to obtain an eighth user data set. The computer software program may perform union analysis on the first user data set and the seventh user data set to obtain a second union analysis result. The computer software program may compare the intersection of the first union analysis result and the second union analysis result to generate an intersection comparison result. The intersection comparison result can more faithfully reflect the short-term user characteristics of a specific group, and the computer software program may build, from the intersection comparison result, a classification model for judging or predicting whether a user belongs to that specific group.
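Two of the steps above, feature importance analysis and deletion of features outside the short-term feature set, can be illustrated in pure Python. The text does not name a specific importance measure; this sketch swaps in the absolute Pearson correlation with the group label as a simple stand-in, and the sample rows, field names, and the 0.5 cutoff are assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_short_term_features(rows, labels, candidates, min_importance):
    """Keep candidate short-term features whose |correlation| with the label meets the cutoff."""
    return [
        feature
        for feature in candidates
        if abs(pearson([row[feature] for row in rows], labels)) >= min_importance
    ]


def keep_only_features(rows, features, key="id"):
    """Build the second data set by dropping every field outside the selected feature set."""
    return [{key: row[key], **{f: row[f] for f in features}} for row in rows]


# First user data set: one long-term feature (annual) and two short-term candidates (weekly).
rows = [
    {"id": "0001", "annual_spend": 40000, "weekly_login_frequency": 2, "weekly_support_tickets": 1},
    {"id": "0002", "annual_spend": 30000, "weekly_login_frequency": 0, "weekly_support_tickets": 2},
    {"id": "0004", "annual_spend": 70000, "weekly_login_frequency": 3, "weekly_support_tickets": 1},
    {"id": "0005", "annual_spend": 800000, "weekly_login_frequency": 10, "weekly_support_tickets": 2},
]
labels = [0, 0, 1, 1]  # 1 = member of the value user group in the grouping result

short_term_set = select_short_term_features(
    rows, labels, ["weekly_login_frequency", "weekly_support_tickets"], min_importance=0.5
)
second_data_set = keep_only_features(rows, short_term_set)
print(short_term_set)
print(second_data_set[0])
```

In practice, the importance scores would more likely come from the fitted machine learning model itself; the correlation here only keeps the sketch dependency-free.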

In some embodiments, the present utility model includes a computer program product usable for user data processing, wherein the computer software program may be stored in a non-volatile storage medium such as a flash memory device, a magnetic storage device, an optical disc, or another type of storage medium, and the computer software program, once loaded by a computer, predicts and expands a user data set by executing steps including: performing feature engineering on a first user data set so that the first user data set includes user data having a user long-term feature set, the user long-term feature set including features having a first periodicity, and the first user data set including user data of a first number of users; fitting the first user data set with a machine learning algorithm to generate a user short-term feature set, the user short-term feature set including features having a second periodicity, wherein the first periodicity of the user long-term feature set has a first period, the second periodicity of the user short-term feature set has a second period, and the first period is longer than the second period; performing feature selection on the first user data set according to the user short-term feature set to generate a second user data set, so that the second user data set includes user data having the user short-term features; fitting the second user data set with the machine learning algorithm to generate a user cluster prediction model; using the user cluster prediction model to make predictions on a third user data set to generate a first user cluster prediction result, which, besides the prediction of whether a specific user belongs to a specific group, further includes a fourth user data set; and merging the first user data set and the fourth user data set to generate a fifth user data set, the fifth user data set including user data of a second number of users, the second number of users of the fifth user data set being higher than the first number of users of the first user data set.

In some embodiments, the user data set may include data of a plurality of different users, and the feature data set may include a plurality of feature fields with different data attributes. Each feature field includes a plurality of feature values, and the feature values of each feature field may constitute a feature value range. The computer software program may calculate or select a feature threshold from the feature value range of each feature field. The data of the feature data set may include the feature thresholds. Each feature threshold may be the maximum, minimum, mean, median, mode, a percentile, a quartile, the standard deviation, or the like of the feature value range of the corresponding feature field. Depending on the attributes of the feature values, the computer software program may apply different threshold decision rules to determine the feature threshold. For example, the computer software program may determine the correlation between a feature value and the target variable or label fitted by the machine learning algorithm: if the correlation is positive, it may calculate or select the minimum of the feature value range as the feature threshold, and if the correlation is negative, it may calculate or select the maximum of the feature value range as the feature threshold. For example, if the target variable of the machine learning algorithm is the consumption amount, the feature is the weekly login frequency, and the minimum of the feature value range is 2, the computer software program may determine that the consumption amount and the weekly login frequency are positively correlated and select that minimum as the feature threshold. In some examples, if the target variable and the feature are only weakly correlated, the computer software program may calculate or select the arithmetic mean of the feature value range as the feature threshold. In some examples, if the feature is a count attribute, the computer software program may calculate or select the mode of the feature value range as the feature threshold. In some examples, the computer software program may set a threshold coefficient h = 0, 0.5, 1, 1.8, 2, ... and calculate the product of a statistic of the feature value range and the threshold coefficient as the feature threshold; for example, with h = 0.5, the feature threshold of the weekly login frequency above becomes 1. In some examples, the computer software program may receive a feature-threshold setting instruction and set the feature threshold according to that instruction.
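The threshold rules described above can be collected into one small helper. The string labels for the relation types, the function name, and the sample value ranges are assumptions for illustration; only the rule-to-statistic mapping and the coefficient h come from the text.

```python
from statistics import mean, mode


def feature_threshold(values, relation, h=1.0):
    """Pick a statistic of the feature value range by rule, then scale it by coefficient h.

    relation: "positive" -> minimum, "negative" -> maximum,
              "weak" -> arithmetic mean, "count" -> mode.
    """
    if relation == "positive":
        statistic = min(values)
    elif relation == "negative":
        statistic = max(values)
    elif relation == "weak":
        statistic = mean(values)
    elif relation == "count":
        statistic = mode(values)
    else:
        raise ValueError(f"unknown relation: {relation}")
    return h * statistic


# Weekly login frequencies observed in the value user group (illustrative values, minimum 2).
login_frequencies = [2, 3, 10]

default_threshold = feature_threshold(login_frequencies, "positive")
scaled_threshold = feature_threshold(login_frequencies, "positive", h=0.5)
print(default_threshold)
print(scaled_threshold)
```

With h = 0.5 the minimum of 2 yields a threshold of 1, matching the worked example in the text.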

The effect of the present utility model is that a user value score can be predicted (that is, whether the user belongs to a given value user group can be predicted) from the user's short-term characteristics, such as daily or single-purchase behavior. Prediction can therefore target individual users, so that users' responses to the current service become visible more promptly and comprehensively, without waiting for the final consumption result. Further, the present utility model makes it possible to retain high-value users effectively and to convert quasi-high-value users into high-value users early. It can even compensate for the shortcomings of existing customer groupings, such as the existing RFM grouping, by treating quasi-high-value users as high-value users (that is, by merging the predicted high-value user group with the currently grouped high-value user group), so as to raise the amount and success rate of transactions and subscriptions.

The above description of "the present utility model", the descriptions in the embodiments and variations, and the content disclosed in the drawings are merely one example for explaining the utility model recited in the claims. Various combinations including different features or implementations are also possible within the spirit of the present utility model, so the utility model recited in the claims is not limited by the above description, embodiments, or drawings. The claims as originally filed in this application are merely an example, and the recitations in the claims may be changed as appropriate based on the description, drawings, and the like.

1: Value user grouping system

11: Platform device

111: Browsing behavior

112: Interests and preferences

113: Device information

12: User management system

121: Demographic data

122: Consumption records

123: Billing records

124: Interaction records

13: Public data source device

131: Climate and geography

132: Finance and economy

133: Humanities and population

134: Medicine and biology

14: Database

15: Prediction device

151: Data processing module

152: Machine learning module

153: Model deployment module

154: Marketing module

Claims

A prediction device for predicting whether a user belongs to a value user group based on short-term user characteristics, comprising a plurality of hardware circuits electrically connected to each other and configured as a plurality of modules, wherein: a data processing module of the plurality of modules obtains a grouping result, wherein the grouping result indicates that at least one user belongs to a value user group; a machine learning module of the plurality of modules performs feature engineering based on a machine learning algorithm to obtain a plurality of user characteristics of the value user group, and establishes a first grouping model of the value user group based on the plurality of user characteristics, wherein at least one of the plurality of user characteristics is a long-term user characteristic, and the long-term user characteristic is a characteristic that can be obtained only after a target time is reached; the machine learning module extracts a plurality of short-term user characteristics from the first grouping model and establishes a second grouping model of the value user group based on the plurality of short-term user characteristics, wherein each short-term user characteristic is a characteristic that can be obtained before the target time is reached; and a model deployment module of the plurality of modules uses the second grouping model to predict, according to a plurality of short-term user characteristics of another user, whether the other user belongs to the value user group.

The prediction device of claim 1, wherein the prediction device further groups a plurality of users according to user data of the plurality of users obtained when the target time is reached, so as to obtain the grouping result.

The prediction device of claim 1, wherein a plurality of user characteristics whose importance ranks among the top few or exceeds a threshold in the value user group are selected as the plurality of user characteristics of the value user group.

The prediction device of claim 1, wherein a plurality of short-term user characteristics whose importance ranks among the top few or exceeds a threshold in the first grouping model are selected as the plurality of short-term user characteristics of the second grouping model.

The prediction device of claim 1, wherein a plurality of short-term user characteristics whose importance ranks among the top few or exceeds a threshold in the first grouping model are selected, further short-term user characteristics are derived by extension from the selected short-term user characteristics, and the selected short-term user characteristics together with the derived short-term user characteristics serve as the plurality of short-term user characteristics of the second grouping model.

The prediction device of claim 1, wherein the prediction device uses the second grouping model to predict a plurality of users as belonging to the value user group; and the prediction device monitors a change over time of an intersection between the predicted plurality of users of the value user group and the plurality of users of the value user group in the grouping result, so as to judge whether a marketing strategy for the plurality of users is correct.

The prediction device of claim 1, wherein the prediction device uses the second grouping model to predict a plurality of users as belonging to the value user group; and the prediction device merges the predicted plurality of users of the value user group with the plurality of users of the value user group in the grouping result as the plurality of users of the value user group.

The prediction device of claim 1, wherein the prediction device updates the second grouping model during an event period to supplement the plurality of short-term user characteristics with at least one event-related short-term user characteristic, and, after the event period ends, updates the second grouping model to remove the event-related short-term user characteristic from the plurality of short-term user characteristics.

The prediction device of claim 1, wherein the prediction device prepares another second grouping model; during an event period, the prediction device updates the other second grouping model and uses the other second grouping model to predict whether the other user belongs to the value user group; and the prediction device does not update the second grouping model during the event period.

A prediction device for predicting whether a user belongs to a value user group based on short-term user characteristics, comprising a plurality of hardware circuits electrically connected to each other, wherein, after loading a computer program for predicting and expanding a user data set, the prediction device is configured with a plurality of modules in signal connection with each other, the modules including a data processing module and a machine learning module that cooperate to perform steps including: performing feature engineering on a first user data set so that the first user data set includes user data having a user long-term feature set, the user long-term feature set including features having a first periodicity, and the first user data set including user data of a first number of users; fitting the first user data set with a machine learning algorithm to generate a user short-term feature set, the user short-term feature set including features having a second periodicity, wherein the first periodicity of the user long-term feature set has a first period, the second periodicity of the user short-term feature set has a second period, and the first period is longer than the second period; performing feature selection on the first user data set according to the user short-term feature set to generate a second user data set, so that the second user data set includes user data having the user short-term features; fitting the second user data set with the machine learning algorithm to generate a user cluster prediction model; using the user cluster prediction model to make predictions on a third user data set to generate a first user cluster prediction result, the first user cluster prediction result including a fourth user data set; and merging the first user data set and the fourth user data set to generate a fifth user data set, the fifth user data set including user data of a second number of users, the second number of users of the fifth user data set being higher than the first number of users of the first user data set.