TWM624658U - Prediction devices for predicting whether users belong to valuable user groups based on short-term user characteristics - Google Patents
- Publication number
- TWM624658U (TW110210571U)
- Authority
- TW
- Taiwan
- Prior art keywords
- user
- term
- short
- users
- value
- Prior art date
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
The present utility model relates to a prediction device for predicting whether a user belongs to a given value user group, and in particular to a prediction device that makes this prediction from the user's short-term characteristics.
In a traditional user management system, a provider of goods or services (that is, the enterprise offering them) often has to wait for the monthly, quarterly, semi-annual, or annual user bills before it can understand how much value each user contributes. User behavior now changes quickly, however. Waiting for those bills before deciding which value user group a user belongs to leaves the provider lost in statistics, unable to optimize its services for the current user environment in real time or to create opportunities for users to spend more.
For example, according to the quarterly bills a user may belong to the value user group with high purchase amounts; but if the provider waits until the next quarter's bill to check whether the user still belongs to that group, the user may already have been lost. Put simply, a provider that understands user behavior in real time has a better chance of retaining high-value users, and may even turn near-high-value users into high-value users early.
To solve the problems of the prior art, the present utility model provides a prediction method, executed on a prediction device, for predicting from short-term user characteristics whether a user belongs to a value user group. The method includes: obtaining a grouping result, which indicates that at least one user belongs to a value user group; performing feature engineering based on a machine learning algorithm to obtain a plurality of user features of the value user group, and building a first grouping model of the value user group from those user features, wherein at least one of the user features is a long-term user feature, that is, a feature that can be obtained only once a target time has arrived; extracting a plurality of short-term user features from the first grouping model and building a second grouping model of the value user group from them, wherein a short-term user feature is a feature that can be obtained before the target time arrives; and using the second grouping model to predict, from a plurality of short-term user features of another user, whether that other user belongs to the value user group.
In an embodiment of the present utility model, a plurality of users are grouped according to user data obtained for them when the target time arrives, so as to obtain the grouping result.
In an embodiment of the present utility model, the grouping result is produced by an RFM model or by a machine learning algorithm.
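The RFM-based grouping mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `rfm_scores` and `high_value_users`, the record layout, and the three thresholds are hypothetical and do not come from the utility model, which does not specify how the RFM values are turned into groups.

```python
from datetime import date

def rfm_scores(purchases, today):
    """Compute recency, frequency and monetary values per user.

    purchases: iterable of (user_id, purchase_date, amount) tuples.
    """
    stats = {}
    for user, day, amount in purchases:
        last, freq, money = stats.get(user, (None, 0, 0.0))
        if last is None or day > last:
            last = day  # keep the most recent purchase date
        stats[user] = (last, freq + 1, money + amount)
    return {
        user: {
            "recency_days": (today - last).days,
            "frequency": freq,
            "monetary": money,
        }
        for user, (last, freq, money) in stats.items()
    }

def high_value_users(rfm, max_recency_days, min_frequency, min_monetary):
    """Hypothetical rule: a user is high-value if all three criteria hold."""
    return {
        user
        for user, s in rfm.items()
        if s["recency_days"] <= max_recency_days
        and s["frequency"] >= min_frequency
        and s["monetary"] >= min_monetary
    }

purchases = [
    ("u1", date(2021, 9, 1), 500.0),
    ("u1", date(2021, 9, 20), 700.0),
    ("u2", date(2021, 7, 1), 50.0),
]
rfm = rfm_scores(purchases, today=date(2021, 9, 30))
vip = high_value_users(rfm, max_recency_days=30, min_frequency=2, min_monetary=1000.0)
```

In this sketch the grouping can only be computed once the purchase data for the whole period is available, which is the delay the short-term prediction is meant to avoid.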
In an embodiment of the present utility model, the prediction method further includes: determining a marketing strategy for the other user according to whether the other user is predicted to belong to the value user group.
In an embodiment of the present utility model, the user features of the value user group are selected as those user features whose importance ranks among the top few or exceeds a threshold.
In an embodiment of the present utility model, the short-term user features of the second grouping model are selected as those short-term user features in the first grouping model whose importance ranks among the top few or exceeds a threshold.
In an embodiment of the present utility model, short-term user features whose importance in the first grouping model ranks among the top few or exceeds a threshold are selected, further short-term user features are extended (derived) from the selected ones, and both the selected and the extended short-term user features serve as the short-term user features of the second grouping model.
In an embodiment of the present utility model, the prediction method further includes: monitoring how the intersection between the users of the predicted value user group and the users of the value user group in the grouping result changes over time, so as to judge whether the marketing strategy applied to those users is correct.
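A minimal sketch of the monitoring step follows. The text only says that the change of the intersection over time is monitored; measuring it as the share of predicted users who appear in the current group is an assumption made here for illustration, and the names `overlap_ratio` and `overlap_trend` are hypothetical.

```python
def overlap_ratio(predicted, current):
    """Share of the predicted value users who are also in the current value group."""
    if not predicted:
        return 0.0
    return len(predicted & current) / len(predicted)

def overlap_trend(snapshots):
    """snapshots: list of (predicted_set, current_set) pairs ordered by time."""
    return [overlap_ratio(predicted, current) for predicted, current in snapshots]

# Two observation times for the same predicted group.
predicted_group = {"a", "b", "c", "d"}
trend = overlap_trend([
    (predicted_group, {"a", "x"}),        # first time: small intersection
    (predicted_group, {"a", "b", "c"}),   # second time: larger intersection
])
```

One possible reading is that a growing ratio suggests the users predicted to be valuable are in fact moving into the value user group, while a shrinking ratio would be a signal to revisit the strategy.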
In an embodiment of the present utility model, the prediction method further includes: merging the users of the predicted value user group with the users of the value user group in the grouping result to form the users of the value user group.
In an embodiment of the present utility model, the second grouping model predicts whether the other user belongs to the value user group from the other user's short-term user features through feature comparison, and the feature comparison is implemented by neural network classification or by computing a cosine similarity.
According to an object of the present utility model, a non-volatile storage medium is provided that stores program code; the program code is read by a computer device to execute any one of the prediction methods described above.
According to an object of the present utility model, a prediction device for predicting from short-term user characteristics whether a user belongs to a value user group is provided. The device is implemented either as a pure hardware circuit or as a computer device running software, and is configured as a plurality of modules that operate to execute any one of the prediction methods described above.
Compared with the prior art, the present utility model can predict a user's value score (that is, predict whether the user belongs to a given value user group) from short-term user characteristics such as daily or per-purchase consumption behavior. It can therefore target individual users and capture their reaction to current services more promptly and comprehensively, without waiting for the final consumption result before responding.
1: Value user grouping system
11: Platform device
11S: User data set
111: Browsing behavior
112: Interests and preferences
113: Device information
114: Internet Protocol (IP) address or device identifier
12: User management system
12S: User data set
121: Demographic data
122: Purchase records
123: Billing records
124: Interaction records
13: Public data source device
13S: User data set
131: Climate and geography
132: Finance and economy
133: Culture and population
134: Medicine and biology
14: Database
15: Prediction device
151: Data processing module
152: Machine learning module
153: Model deployment module
154: Marketing module
21: Short-term user feature set
211~21M: Short-term user features
22: User group
231~23N: Value user groups
S31~S35: Steps
41: Current RFM grouping result
411: High-value user group
42: Predicted RFM grouping result
421: High-value user group
51: Current value user group
52: Predicted value user group
531~533: Intersections
FIG. 1 is a block diagram of a value user grouping system according to an embodiment of the present utility model.
FIG. 2 is a conceptual diagram of grouping users into value user groups from short-term user features according to an embodiment of the present utility model.
FIG. 3 is a flowchart of a prediction method that groups users into value user groups from short-term user features according to an embodiment of the present utility model.
FIG. 4 is a schematic diagram of the predicted RFM grouping produced by the prediction method or device according to an embodiment of the present utility model, together with the current RFM grouping.
FIG. 5A is a schematic diagram of the value user group obtained at a first time by the prediction method or device according to an embodiment of the present utility model, together with the current value user group.
FIG. 5B is a schematic diagram of the value user group obtained at a second time by the prediction method or device according to an embodiment of the present utility model, together with the current value user group.
FIG. 5C is another schematic diagram of the value user group obtained at the second time by the prediction method or device according to an embodiment of the present utility model, together with the current value user group.
Preferred embodiments of the present utility model are described in detail below with reference to the drawings, so that those skilled in the art can more readily understand its advantages and features and so that its scope of protection is defined more clearly.
The present utility model uses machine learning to collect and analyze user data (including short-term user data, long-term user data, and non-time-related user data) in order to analyze user behavior (corresponding to user features, which include long-term, short-term, and non-time-related user features) and build a first grouping model, which then groups users based on their user data. Next, the first grouping model is used to obtain the correlation between the user value score and a user's short-term features. This correlation can be understood as a check on whether the short-term features contribute significantly to the user value score. If they do, a second grouping model rebuilt from the short-term features has explanatory power and can be used to predict user value (that is, to predict whether the user belongs to a given value user group, such as the high-value user group) from the user's short-term features, which represent short-term user behavior. For example: is the number of times a user logs in to the website each week related to the user's lifetime value? Is a user who stays on the home page for more than 5 seconds at every login more willing to spend? Feature engineering analyzes user behavior from the user data (short-term and long-term) to extract important and meaningful user features (short-term, long-term, and non-time-related). For example, user data can be split into short-term and long-term data by day, week, month, quarter, half-year, or year. Different business contexts care about different time spans: for e-commerce retail, a week may count as long term and an hour as short term, whereas for the financial industry three years may count as long term and a week as short term. In short, what "short-term" and "long-term" mean for user data depends on the business context. Non-time-related user data is data that does not depend on time, such as a user's registered household address, and non-time-related user features are likewise features that do not depend on time.
In the present utility model, user data is first used to group users by value, assigning each user to one of a plurality of value user groups. With a grouping or classification model such as the RFM model (Recency, Frequency, Monetary: a model that groups users by the recency of their latest purchase, their purchase frequency, and their purchase amount), users may be assigned to, for example, a high-value (VIP) user group, a high-spending user group, a frequent-purchase user group, a recently highly active user group, or another value user group. Next, a first grouping model is built for one value user group of interest (for example, the high-value user group). Feature engineering is then performed: the first grouping model is used to find the short-term user features that are important and meaningful for this value user group, and those features are used to build a second grouping model that depends only on short-term user behavior. Finally, based on the second grouping model, the user's medium- and long-term customer value score is predicted from the current user's short-term features; that is, the second grouping model predicts from those features whether the user belongs to its value user group.
This prediction is made through feature comparison. The second grouping model may be a grouping model obtained with a classification or regression algorithm, or a neural network trained with another machine learning algorithm. Alternatively, feature comparison may compute the correlation between the values of the user's short-term features and the values of the short-term features of the second grouping model, for example as a cosine similarity, to predict whether the user belongs to the second grouping model's value user group.
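The cosine-similarity variant of feature comparison can be illustrated with a short sketch. The representation of the value user group as a single profile vector, the function `belongs_to_group`, and the threshold of 0.9 are assumptions for illustration; the text does not state how the similarity is turned into a decision.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; treat as no match
    return dot / (norm_a * norm_b)

def belongs_to_group(user_features, group_profile, threshold=0.9):
    """Hypothetical decision rule: similar enough to the group's profile vector."""
    return cosine_similarity(user_features, group_profile) >= threshold

# Illustrative short-term features: weekly visits, cart additions, clicks.
group_profile = [6.0, 2.0, 8.0]
similar_user = [3.0, 1.0, 4.0]
different_user = [0.0, 5.0, 0.0]
```

Because cosine similarity ignores vector length, `similar_user` matches the profile even though its raw values are half as large; whether that scale invariance is desirable depends on the business context.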
In short, a predefined RFM model, the first grouping model, and the second grouping model can all be used to score a user's value. The RFM model and the first grouping model group users using long-term user data and long-term user features. Under today's fierce commercial competition, they may have to wait until long-term user data is available before users can be marketed to effectively and precisely, so valuable users may be lost in the meantime. The present utility model can therefore use the second grouping model to score a user's value early from short-term user data (that is, from short-term user features), enabling effective and precise marketing that increases users' spending and helps avoid the loss of valuable users.
A user feature that is important and meaningful for a value user group is one whose importance ranks among the top few or exceeds a certain threshold. For example, purchase amount is an important and meaningful feature for the high-value user group, whereas whether a user pays by cash or credit card may be unimportant and meaningless for that group. When building a grouping model, a variety of user features can be defined so that, when the model is evaluated, it is possible to tell whether the model has explanatory power (that is, whether its grouping accuracy is sufficient). If the explanatory power is insufficient, more user features can be added. Usually, however, some user features in a grouping model are unimportant and meaningless for the model's value user group.
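Selecting features by rank or by threshold can be sketched as below. The importance scores are assumed to come from the first grouping model (for instance a tree-based model's feature importances), but how they are computed is not specified in the text; the function `select_features`, the feature names, and the numbers are all hypothetical.

```python
def select_features(importance, top_k=None, threshold=None, allowed=None):
    """Return feature names ordered by descending importance.

    importance: dict mapping feature name to an importance score.
    top_k:      keep at most this many features, if given.
    threshold:  keep only features scoring above this value, if given.
    allowed:    optional set restricting candidates, e.g. short-term features only.
    """
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    if allowed is not None:
        ranked = [(name, score) for name, score in ranked if name in allowed]
    if threshold is not None:
        ranked = [(name, score) for name, score in ranked if score > threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [name for name, _ in ranked]

# Illustrative importances from a first grouping model.
importance = {
    "monthly_spend": 0.40,        # long-term feature
    "weekly_visits": 0.25,        # short-term feature
    "cart_adds_per_week": 0.20,   # short-term feature
    "home_address": 0.14,         # non-time-related feature
    "payment_method": 0.01,       # short-term but unimportant
}
short_term = {"weekly_visits", "cart_adds_per_week", "payment_method"}
second_model_features = select_features(importance, threshold=0.1, allowed=short_term)
```

Restricting the candidates to `short_term` mirrors the step of carrying only short-term features over into the second grouping model.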
Using a grouping model to make a prediction for a user means scoring the user's value, that is, performing feature comparison to predict whether the user belongs to the model's value user group. Feature comparison may refer to computing the similarity between the values of the user's features and the values of the features of the model's value user group, or to the probability result derived from that similarity. There are many ways to perform feature comparison. The present utility model is not limited to probability results computed by a neural network or to cosine similarity; feature comparison implemented with a support vector machine or a decision tree is also within its scope.
In the present utility model, "short-term" means shorter than the period the desired answer refers to. For example, to find out which users will have contributed the most by the end of the month, one may predict from user features covering one, two, or three weeks; one, two, or three weeks is short-term relative to one month. Any period shorter than the target period can be called short-term, and the target period is called long-term. Note that some features are unrelated to time; these are usually neither long-term nor short-term user features but non-time-related user features. For example, if a user's registered household address is used as a user feature, it is neither a short-term nor a long-term user feature. Short-term user features are specifically features defined relative to a time span.
In the present utility model, user data is usually raw data rather than user features, unless the database is designed from the start to store user features as user data; for example, a database may be designed to compute each user's number of store visits and average weekly spending and to store these as user data. A feature is a representative piece of data and can be understood as an input parameter or variable; meaningful input parameters or variables affect the final user value score. For example, browsing-behavior features such as often clicking advertisements after entering a web page, liking to watch videos, or staying on a web page for no more than 10 seconds may all be meaningful features.
Common features fall into three types: numerical, categorical, and temporal. Features of these three types can be converted into one another; for example, a numerical feature can be processed and converted into a temporal feature. A numerical feature uses descriptive statistics to divide the overall data range into several intervals. A categorical feature indicates which category something belongs to; for example, device information may be laptop, mobile phone, or tablet, and which of these the device is constitutes a categorical feature. A numerical feature can also be converted into a categorical feature, and a categorical feature can be weighted according to the proportion of each category to convert it into a numerical feature. A temporal feature may be obtained by computing the frequency, count, or magnitude of a numerical feature on a weekly, monthly, or quarterly basis; there are many ways to compute it, and they are not limited to those described here. In short, a feature of any one of the three types can be processed to generate a feature of another type.
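The three conversions described above might look like the following sketch. The bin edges, labels, and helper names (`to_category`, `category_weights`, `weekly_counts`) are illustrative assumptions; the text deliberately leaves the exact calculation open.

```python
from collections import Counter
from datetime import date

def to_category(value, edges, labels):
    """Numerical to categorical: bin a value using ascending edges.

    labels must contain one more entry than edges.
    """
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]

def category_weights(values):
    """Categorical to numerical: weight each category by its share of the data."""
    counts = Counter(values)
    total = len(values)
    return {category: n / total for category, n in counts.items()}

def weekly_counts(days):
    """Event dates to a temporal feature: events per (ISO year, ISO week)."""
    return dict(Counter(tuple(d.isocalendar()[:2]) for d in days))

spend_level = to_category(250.0, edges=[100.0, 1000.0], labels=["low", "mid", "high"])
device_weight = category_weights(["phone", "phone", "laptop", "tablet"])
visits_per_week = weekly_counts([date(2020, 9, 1), date(2020, 9, 5), date(2020, 9, 8)])
```

Here a spend amount becomes a category, device categories become numeric weights, and raw visit dates become a per-week count that can serve as a short-term feature.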
The present utility model uses short-term user features in order to predict long-term results from short-period feature parameters. Although short-term user features are time-related, feature types are interconvertible, so in a broad sense short-term user features can include the numerical, categorical, and temporal features described above, provided they are derived over a period (short-term) shorter than the target time (long-term). "Short-term" is a relative notion. For example, a user's final spending for a month is normally known only at the end of the month, but the user's weekly purchase records, frequency, cycle, number of store visits, browsing frequency, click counts and frequency, additions to the shopping cart or wish list, and past history give a good chance of predicting the user's purchases by the end of the month. Once the important short-term user features have been identified, an automated pipeline can be built on them; in actual use, the pipeline automatically converts user data and extracts the short-term user features.
In the present utility model, an extended feature is a feature that did not originally exist: one that is computed, converted, or extracted from existing features or existing data. Overly fine-grained data can be hard to use directly as a feature, so the data can be converted to produce extended features for model training and processing. For example, address data generally contains a lot of information. Take the address given for the Presidential Office, "No. 1, Sec. 2, Xinsheng S. Rd., Da'an Dist., Taipei City 106". Modeling directly on this data would fail because it is too detailed, so feature conversion is applied: the original "Address: No. 1, Sec. 2, Xinsheng S. Rd., Da'an Dist., Taipei City 106" may be converted into "Postal code: 106; City: Taipei City; District: Da'an District; Remaining address: No. 1, Sec. 2, Xinsheng S. Rd.", or into the coarser location "Latitude/longitude: 25N/121E" obtained by a coordinate lookup. As another example, an existing feature "login frequency per 28 days = 10" can be extended into "average weekly login frequency = 2" or "average biweekly login frequency = 4". As a further example, from an existing or raw record "ID: 0001; 2020-9-1 10:00:05; log in to home page; device=iphone12", several extended features can be derived relating to user identifier, login time, login count, and device used; the login-time feature may comprise several extended features computed over different time slots or periods, and the device feature may comprise extended features relating to device brand and device model.
To give another example of an extended feature: suppose that, in a numerical regression application, a model evaluating chip performance has resistance R and current I as existing features. If a person skilled in the relevant field points out that power P is also an important and meaningful feature, the feature P = I²R can be derived and used in the model. As another example, in online marketing, total impressions I and total clicks TC are existing features, from which the click-through rate CTR = TC/I can be derived to evaluate the effectiveness of advertisements, keywords, and the like. As a further example, a user's data includes "ID: 0001; 2020-9-1 10:00:05; log in to home page", "ID: 0001; 2020-9-1 10:00:15; leave home page", "ID: 0001; 2020-9-5 14:08:30; log in to home page", "ID: 0001; 2020-9-5 14:08:30; purchase; amount = 1,000", and "ID: 0001; 2020-9-5 14:08:50; leave home page". By analyzing the visits recorded in this raw data, that is, the login and leave times, the extended features may include "home-page logins within seven days", "average daily home-page logins", "home-page logins in September", "home-page logins in the first week of September", "daytime visit frequency", "dwell time per login", "session length", and so on. In short, the present utility model extends features in a similar way: starting from existing user data or predefined user features, it derives features that are important and meaningful to the grouping model.
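Please refer to FIG. 1, which is a block diagram of a value user grouping system according to an embodiment of the present utility model. The value user grouping system 1 is used together with a user service system (not shown). The user service system provides services to users, while the value user grouping system 1 groups the users according to the user data of the users served, predicting whether a user belongs to a particular value user group so that a marketing strategy can be decided and both transactions and user satisfaction increased. The services provided are, for example, online or physical shopping, transactions, investment, subscription, or consultation; that is, the user service system may include an online service host or a point-of-sale (POS) terminal in a physical store, and the present utility model is not limited thereto. In other words, the value user grouping system 1 can group users of either online or physical services into value user groups.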
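The value user grouping system 1 includes a platform device 11, a user management system 12, a public data source device 13, a database 14, and a prediction device 15 that predicts from short-term user features whether a user belongs to a value user group. The database 14 is signal-connected to the platform device 11, the user management system 12, the public data source device 13, and the prediction device 15. In the present utility model, "signal-connected" means a wireless or wired connection, implemented in software or hardware, that allows signals or information to pass among the connected components.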
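The platform device 11 is an enterprise platform or another type of platform. It obtains user data comprising the user's browsing behavior 111, interests and preferences 112, device information 113, and Internet Protocol (IP) address or device identification 114. The user management system 12 obtains user data comprising the user's demographic data 121, purchase records 122, account records 123, and interaction records 124. The public data source device 13 obtains user data comprising climate and geography 131, finance and economy 132, culture and population 133, and medical and biological data 134. The user data obtained by the platform device 11, the user management system 12, and the public data source device 13 can be stored in the database 14, which may be an unstructured database; the present utility model is not limited thereto, and a structured database may also be used.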
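The prediction device 15 is usually implemented as software running on a computer device. For example, the computer device includes a computing unit, a storage unit, and a communication unit, with the computing unit electrically connected to the storage unit and the communication unit. The computing unit reads program code stored in a non-volatile storage medium; when executed, the code carries out a method of predicting a user's value score from the user's short-term features (that is, predicting from those features whether the user belongs to a particular value user group). The prediction device 15 can also be implemented as a hardware circuit, for example by writing Verilog or VHDL code and programming a field-programmable gate array (FPGA) chip with it, or by fabricating an application-specific integrated circuit (ASIC), in either case yielding a pure hardware implementation of the prediction device 15. In short, whether the prediction device 15 is implemented in software or in hardware does not limit the present utility model.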
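Whether implemented in software or hardware, the prediction device 15 can be divided into multiple modules as shown in FIG. 1. In FIG. 1, the prediction device 15 includes a data processing module 151, a machine learning module 152, a model deployment module 153, and a marketing module 154. These modules are signal-connected to one another and to the database 14. The module division in FIG. 1 is only one way of implementing the present utility model and is not intended to limit it. The data processing module 151, the machine learning module 152, the model deployment module 153, and the marketing module 154 are described below together with the prediction method of FIG. 3, which groups users into value user groups based on short-term user features, so the details are not repeated here.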
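Before describing the prediction device 15 and the prediction method of FIG. 3 in detail, please refer to FIG. 2, a conceptual diagram of grouping users into value user groups based on short-term user features according to an embodiment of the present utility model. In FIG. 2, a user data set 12S, made up of the demographic data 121, purchase records 122, account records 123, and interaction records 124, is used to group multiple users into value user groups. Each user is assigned to one of the value user groups 231, 232, ..., 23N, where N is a positive integer greater than or equal to 2. The grouping may use the RFM model, or another approach such as grouping by static features (age, gender, occupation, registration time) or by dynamic features (RFM parameters, membership level or promotion status, account balance, the pickup or delivery method chosen for goods); the present utility model is not limited thereto.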
In other words, on the right side of FIG. 2, the RFM model can first be used to group multiple users into value user groups, and it can do so only on the basis of user data produced from the user data set 12S. Most of the user data in the user data set 12S is long-term user data, so the RFM model needs at least long-term user data to perform the grouping. The user data set 12S may also contain short-term user data and non-time-related user data.
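As stated above, the purpose of the present utility model is to use short-term user features, whose statistical period is shorter than that of long-term user features, to learn in advance whether each user of the user group 22 is likely to belong to one of the value user groups 231~23N. To that end, a short-term user feature set 21 comprising multiple short-term user features 211~21M can be generated from the browsing behavior 111, interests and preferences 112, device information 113, and IP address or device identification 114 in the user data set 11S, together with the climate and geography 131, finance and economy 132, culture and population 133, and medical and biological data 134 in the user data set 13S.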
So that at least some of the short-term user features 211~21M in the short-term user feature set 21 can be used to predict whether a user belongs to one of the value user groups 231~23N, the present utility model builds a corresponding first grouping model for each group of interest among the value user groups 231~23N (possibly all of them), performs feature engineering on the first grouping model to identify short-term user features, and builds a second grouping model from them. The second grouping model then predicts, from a user's short-term features, whether the user belongs to one of the value user groups 231~23N. For example, for the value user group 231, the user features that are important and meaningful to that group are identified (including long-term, short-term, and non-time-related user features); feature engineering then identifies the short-term user features in the first grouping model; these are used to build the second grouping model of the value user group 231; and that second grouping model then predicts from a user's short-term features whether the user belongs to the value user group 231.
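Please refer next to FIG. 1 and FIG. 3. FIG. 3 is a flowchart of the prediction method for grouping users into value user groups based on short-term user features according to an embodiment of the present utility model. First, in step S31, the data processing module 151 obtains a grouping result, which indicates the users included in at least one value user group. More specifically, the data processing module 151 can collect user data (including at least long-term user data) and classify users into the corresponding value user groups according to that data. With the RFM model, for example, users can be classified into a high-value (VIP) user group, a high-spending user group, a frequent-purchase user group, a recently highly active user group, or other value user groups. Alternatively, in step S31 the data processing module 151 may process the user data to obtain user features (including at least long-term user features) and apply a preset grouping model to those features to obtain the grouping result. In short, how the data processing module 151 obtains the grouping result does not limit the present utility model. In other embodiments, the grouping need not be performed by the data processing module 151 at all; the module may simply receive the grouping result.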
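In a membership system of the kind commonly used in e-commerce, each user receives a user account upon registration, and all of the user's subsequent activity in the user service system is recorded in the database 14: when the user made a purchase, how much was spent, the time, place, and items of each purchase, which pages were clicked before purchasing, how many items were put in the shopping cart, how many were actually bought, and so on. After the data processing module 151 obtains this data and compiles statistics at the end of the month, it can obtain a grouping result through various grouping methods (for example the RFM model mentioned above, or grouping by product preference or by event participation). Taking the RFM model as an example, it classifies users by total spending, purchase frequency, and most recent purchase into the high-value (VIP) user group, the high-spending user group, the frequent-purchase user group, the recently highly active user group, or other value user groups.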
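A month-end RFM grouping of this kind might be sketched as below. The purchase history, the cut-off values, and the mapping from RFM values to group labels are all hypothetical; the text names the groups but does not specify how the boundaries are drawn.

```python
from datetime import date

# Hypothetical purchase history: user_id -> list of (purchase_date, amount).
PURCHASES = {
    "0001": [(date(2020, 9, 5), 1000), (date(2020, 9, 20), 2000)],
    "0002": [(date(2020, 6, 1), 300)],
}


def rfm(purchases, today):
    """Return (recency in days, frequency, monetary total) for one user."""
    recency = (today - max(day for day, _ in purchases)).days
    frequency = len(purchases)
    monetary = sum(amount for _, amount in purchases)
    return recency, frequency, monetary


def segment(recency, frequency, monetary,
            max_recency=30, min_frequency=2, min_spend=2500):
    """Map RFM values to a group label; the cut-offs are illustrative only."""
    recent = recency <= max_recency
    frequent = frequency >= min_frequency
    big_spender = monetary >= min_spend
    if recent and frequent and big_spender:
        return "high-value"
    if big_spender:
        return "high-spending"
    if frequent:
        return "frequent-purchase"
    if recent:
        return "recently-active"
    return "other"


month_end = date(2020, 9, 30)
groups = {uid: segment(*rfm(history, month_end))
          for uid, history in PURCHASES.items()}
print(groups)
```

Because recency, frequency, and total spending are accumulated over the whole statistical period, a grouping like this is only available once that period has ended, which is the delay the short-term approach is meant to avoid.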
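In step S32, the data processing module 151 and the machine learning module 152 build the first grouping model of a value user group. The grouping result shows which users belong to a given value user group. The data processing module 151 processes the user data of those users and, according to system-default or manually set rules, obtains or identifies multiple user features from it (including at least long-term user features, and possibly also short-term and non-time-related user features). The machine learning module 152 then builds the first grouping model of the value user group from the user features obtained by the data processing module 151.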
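Optionally, because some user features may not be important or meaningful to the value user group, if the features produced by the data processing module 151 are not already known to be related to the group, the machine learning module 152 may in step S32 learn to select only those features that are important and meaningful to it. A machine learning algorithm can be used for this feature engineering, extracting and confirming which user features matter to the value user group (for example, the high-value user group). Feature selection is typically performed by importance score: the top-ranked features, or those whose importance exceeds a threshold, are taken as the important user features of the value user group, that is, the features that are meaningful and important to it. The first grouping model is then built from these important user features.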
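Selection by importance score can be sketched as follows. The scores are made up for illustration, and `select_features` is a hypothetical helper; how the scores are obtained (for example from a tree-based model) is not specified in the text and is left outside the sketch.

```python
def select_features(importance, top_k=None, threshold=None):
    """Keep the top_k highest-scoring features and/or those scoring >= threshold."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(name, score) for name, score in ranked if score >= threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [name for name, _ in ranked]


# Hypothetical importance scores for one value user group.
scores = {
    "weekly_login_freq": 0.41,
    "avg_dwell_time": 0.27,
    "weekly_spend": 0.22,
    "weekly_homepage_visits": 0.06,
    "device": 0.04,
}

by_rank = select_features(scores, top_k=3)
by_threshold = select_features(scores, threshold=0.2)
print(by_rank, by_threshold)
```

With these scores both rules keep the same three features; on real data the two rules can disagree, so which one to use is a design choice.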
For example, once it is known which users belong to the high-value user group and which do not, the user features that are meaningful and important to the high-value user group can be extracted; features for other value user groups can be extracted in a similar way. The extracted features might be, for example, that users in the high-value user group browse the website for more than 10 seconds at least once a day, that they visit the home page and stay at least 5 seconds at least once a week, or that they habitually log in from a mobile phone rather than a personal computer. Knowing that these features are important, the data processing module 151 processes the user data to extract them, and the machine learning module 152 builds the first grouping model of the high-value user group from them.
A user's data is usually raw data that the data processing module 151 must process and convert. In this example, the user data of one user in the high-value user group includes "ID: 0001; 2020-9-1 10:00:05; log in to home page; device=iphone12", "ID: 0001; 2020-9-1 10:00:15; leave home page; device=iphone12", "ID: 0001; 2020-9-5 14:08:30; log in to home page; device=iphone12", "ID: 0001; 2020-9-5 14:08:30; purchase; amount = 1,000; device=iphone12", and "ID: 0001; 2020-9-5 14:08:50; leave home page; device=iphone12".
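From this data, the data processing module 151 can automatically extract the following features for the user "ID: 0001": "weekly login frequency = 2", "average dwell time = 15", "weekly home-page visits = 2", "weekly spending = 1,000", and "device=iphone12". In this example, "device=iphone12" can be regarded as a non-time-related user feature. If statistics are compiled only at the end of the month, then "weekly login frequency = 2", "average dwell time = 15", "weekly home-page visits = 2", and "weekly spending = 1,000" are all short-term user features. Note that although no long-term user features are listed here, the data processing module 151 can in practice also obtain them, for example "monthly login frequency = 9", "average dwell time = 12", "monthly home-page visits = 9", and "monthly spending = 3,000".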
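The extraction of weekly features from the raw records of user 0001 can be sketched as follows. The tuple layout of `LOG`, the action names, and the function `weekly_features` are assumptions made for illustration; dwell time is taken to be the number of seconds between a login and the next leave event, which reproduces the value 15 given for this user.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# The five raw records of user 0001: (user_id, timestamp, action, amount).
LOG = [
    ("0001", "2020-09-01 10:00:05", "login", 0),
    ("0001", "2020-09-01 10:00:15", "leave", 0),
    ("0001", "2020-09-05 14:08:30", "login", 0),
    ("0001", "2020-09-05 14:08:30", "purchase", 1000),
    ("0001", "2020-09-05 14:08:50", "leave", 0),
]


def weekly_features(log, user_id, week_start):
    """Aggregate one user's records inside a seven-day window."""
    week_end = week_start + timedelta(days=7)
    logins, spend, dwell, opened = 0, 0, [], None
    for uid, stamp, action, amount in log:
        when = datetime.strptime(stamp, FMT)
        if uid != user_id or not (week_start <= when < week_end):
            continue
        if action == "login":
            logins += 1
            opened = when
        elif action == "leave" and opened is not None:
            dwell.append((when - opened).total_seconds())
            opened = None
        elif action == "purchase":
            spend += amount
    return {
        "weekly_login_freq": logins,
        "avg_dwell_seconds": sum(dwell) / len(dwell) if dwell else 0.0,
        "weekly_spend": spend,
    }


features = weekly_features(LOG, "0001", datetime(2020, 9, 1))
print(features)
```

The two visits last 10 and 20 seconds, so the average dwell time is 15 seconds, alongside two logins and 1,000 in spending for the week.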
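In step S33, based on the first grouping model, the machine learning module 152 performs feature engineering to extract short-term user features and builds the second grouping model from them. Note that the second grouping model may also take non-time-related user features as input; that is, the machine learning module 152 may additionally extract non-time-related user features and build the second grouping model from both kinds. If the user features of the first grouping model include long-term, short-term, and non-time-related features, the short-term user features can be selected from the first grouping model by importance (optionally together with important and meaningful non-time-related features), and the second grouping model is built from the extracted features. The feature engineering may be implemented with a machine learning algorithm, and the present utility model is not limited thereto.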
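In some cases the second grouping model may lack full explanatory power, because the first grouping model may contain too few short-term user features, or none at all. Therefore, in step S33 the data processing module 151 may further process the user data and derive multiple short-term user features from the user features of the first grouping model, giving the second grouping model more explanatory power. For example, if one long-term user feature in the first grouping model is "monthly login frequency > 9", the short-term user features "weekly login frequency > 2" or "biweekly login frequency > 4" can be derived from it. One way to carry out this feature extension is to draw a radar chart and derive short-term user features for the feature types in which the first grouping model is weak, so that the second grouping model has sufficient explanatory power.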
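One possible way to derive the shorter-period thresholds from the monthly one is proportional scaling with rounding down, sketched below. Both the 28-day month and the rounding rule are assumptions; they are chosen because they reproduce the values in the example (monthly > 9 giving weekly > 2 and biweekly > 4), not because the text prescribes them.

```python
import math


def shorten_threshold(long_threshold, long_days, short_days):
    """Scale a count threshold to a shorter period and round down."""
    return math.floor(long_threshold * short_days / long_days)


MONTH_DAYS = 28  # assumption: a month is treated as four weeks

weekly = shorten_threshold(9, MONTH_DAYS, 7)     # 9 * 7 / 28 = 2.25
biweekly = shorten_threshold(9, MONTH_DAYS, 14)  # 9 * 14 / 28 = 4.5
print(weekly, biweekly)  # 2 4
```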
For example, if the short-term user features found to be important and meaningful to the first grouping model of the high-value user group are "weekly login frequency > 2", "average dwell time > 10", and "weekly spending > 1,000", these short-term user features are used to build and train the second grouping model. If the first grouping model of the high-value user group also contains unimportant, meaningless user features such as "weekly home-page visits = 2" and "device=iphone12", those features are not used to train or build the second grouping model, which avoids overfitting.
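The data processing module 151 can determine the short-term period of a short-term feature according to a short-term period rule; the short-term period serves as the observation period for the short-term feature value. The module can also adjust the short-term period according to such a rule, for example by analyzing how feature values are distributed along the time axis. It may, for instance, perform time-series analysis on the long-term feature data set to find characteristics such as seasonality and adopt a corresponding short-term period rule. A short-term period may be a calendar period such as an hour, day, week, month, or quarter, or a variation of one, such as two days or two weeks. Short-term user features need not be split into fixed periods: when user behavior such as browsing, visiting, or purchasing is non-linear, a non-fixed period can be used. For example, suppose users are more active on the platform device 11 at the beginning of each month and their activity declines toward the end of the month. If features are split into fixed periods, the feature values near the end of the month may distort the prediction. Say a user habitually spends V1 > 10,000 in the first week of each month, with spending falling off afterward, so that the second to fourth weeks are V2 = 2,000, V3 = 0, and V4 = 0. If monthly spending is the long-term feature and the short-term feature uses a fixed period, the spending in the last week or two may cause this user's data to be predicted as a negative example. To solve this problem, short-term periods can be set to increase, decrease, or vary in other ways. Periods need not be contiguous: a specific point in time can serve as an anchor, and the period observed is a window starting from that anchor. For example, with the first day of each month as the anchor and a seven-day window, the first week can be set as a first period and the first through fourth weeks as a second period, or the first through fourth weeks can be ignored and only the first week of each month used as the short-term feature period. Alternatively, the window can cover Monday through Wednesday of each week and ignore Thursday through Sunday. Periods can also grow by one day each time, as in the sequence P1 = 1 day, P2 = 2 days, P3 = 3 days, P4 = 4 days, ..., Pn = n days. Or the period can increase in steps: within one month after membership registration the short-term period is one week, and from two to six months after registration it is two weeks. Periods can also overlap, as with a sliding window: with a three-day window, September 1 to 3 is one short-term period, September 2 to 4 is another, and September 3 to 5 is another. As this description shows, short-term features are defined relative to long-term features, so long-term features may also be non-fixed or vary along with the short-term features. In the stepped example, the long-term feature may have a one-month period within one month after registration and a six-month period from two to six months after registration.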
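Two of the non-fixed period rules, the overlapping sliding window and the window anchored at the first day of a month, can be sketched as follows. The function names are hypothetical and each window is represented as an inclusive (first day, last day) pair.

```python
from datetime import date, timedelta


def sliding_windows(start, end, size_days):
    """Overlapping windows of size_days that advance one day at a time."""
    windows = []
    first = start
    while first + timedelta(days=size_days - 1) <= end:
        windows.append((first, first + timedelta(days=size_days - 1)))
        first += timedelta(days=1)
    return windows


def anchored_window(anchor, size_days):
    """A single window that starts at an anchor date."""
    return anchor, anchor + timedelta(days=size_days - 1)


three_day = sliding_windows(date(2020, 9, 1), date(2020, 9, 5), 3)
first_week = anchored_window(date(2020, 9, 1), 7)

for first, last in three_day:
    print(first.isoformat(), "to", last.isoformat())
print(first_week)
```

The three-day sliding window yields September 1 to 3, 2 to 4, and 3 to 5, matching the overlapping periods described; the anchored window covers September 1 to 7.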
After the second grouping model has been built, in step S34 the model deployment module 153 deploys it and uses it to predict, from a user's short-term features, whether the user belongs to the value user group of the second grouping model. Put simply, once the second grouping model is deployed, whenever a user logs in the prediction device 15 can tell whether the user belongs to that value user group without waiting for the target time (for example, the long-term period of a long-term feature) to arrive. Incidentally, the prediction can be made by feature comparison, for example by neural-network classification or by computing cosine similarity.
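Feature comparison by cosine similarity might look like the sketch below. The group profile vector, the user vectors, and the idea of comparing a user against a single representative profile are assumptions for illustration; in practice the features would usually be scaled to comparable ranges first, and the decision cut-off would have to be chosen on real data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical, already-scaled vectors:
# (weekly login frequency, average dwell time, weekly spending).
group_profile = [0.8, 0.6, 0.9]
user_a = [0.7, 0.5, 0.8]
user_b = [0.9, 0.1, 0.0]

similarity_a = cosine_similarity(group_profile, user_a)
similarity_b = cosine_similarity(group_profile, user_b)
print(similarity_a > similarity_b)  # user_a is closer to the group profile
```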
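After a user has been predicted, from the user's short-term features, to belong or not belong to the value user group, in step S35 the marketing module 154 decides the marketing strategy for the user according to that prediction result. For example, if short-term features are extracted from one week of a user's data and the user is predicted to belong to the high-value user group, coupons or promotions intended for near-high-value or high-value users can be sent to the user directly, increasing the success rate of transactions or subscriptions. Incidentally, the prediction result may be expressed as a probability, for example 50%, 70%, or 90%, so the marketing module 154 can offer the user different promotional information depending on the probability value.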
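Mapping a predicted probability to an offer could be as simple as the tiered lookup below. The tier boundaries reuse the 50%, 70%, and 90% figures from the example, but the offer names and the rule that each tier applies at or above its boundary are hypothetical.

```python
# Hypothetical tiers, highest first: (minimum probability, offer).
OFFER_TIERS = [
    (0.9, "high-value coupon"),
    (0.7, "near-high-value coupon"),
    (0.5, "general promotion"),
]


def choose_offer(probability):
    """Return the offer of the highest tier the probability reaches, or None."""
    for minimum, offer in OFFER_TIERS:
        if probability >= minimum:
            return offer
    return None


for p in (0.95, 0.75, 0.5, 0.2):
    print(p, choose_offer(p))
```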
For example, suppose the short-term features of several users are "ID: 0001; weekly login frequency = 2; average dwell time = 15; weekly spending = 800", "ID: 0002; weekly login frequency = 0.01; average dwell time = 7; weekly spending = 600", "ID: 0003; weekly login frequency = 1; average dwell time = 50; weekly spending = 50,000", "ID: 0004; weekly login frequency = 3; average dwell time = 25; weekly spending = 1,300", and "ID: 0005; weekly login frequency = 10; average dwell time = 50; weekly spending = 15,300", and the short-term user features of the second grouping model are "weekly login frequency > 2", "average dwell time > 10", and "weekly spending > 1,000". The users "ID: 0004" and "ID: 0005" are then predicted to belong to the value user group of the second grouping model.
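This worked example can be checked with a few lines of code. The sketch assumes the second grouping model reduces to a conjunction of threshold rules, which is how the example reads; a trained model need not have this form, and the dictionary keys are hypothetical names.

```python
# Short-term features of the five users in the example.
USERS = {
    "0001": {"weekly_login_freq": 2, "avg_dwell_time": 15, "weekly_spend": 800},
    "0002": {"weekly_login_freq": 0.01, "avg_dwell_time": 7, "weekly_spend": 600},
    "0003": {"weekly_login_freq": 1, "avg_dwell_time": 50, "weekly_spend": 50000},
    "0004": {"weekly_login_freq": 3, "avg_dwell_time": 25, "weekly_spend": 1300},
    "0005": {"weekly_login_freq": 10, "avg_dwell_time": 50, "weekly_spend": 15300},
}

# Thresholds of the second grouping model; every one must be exceeded.
THRESHOLDS = {"weekly_login_freq": 2, "avg_dwell_time": 10, "weekly_spend": 1000}


def in_value_group(features, thresholds):
    """True when each feature is strictly greater than its threshold."""
    return all(features[name] > limit for name, limit in thresholds.items())


predicted = sorted(uid for uid, f in USERS.items() if in_value_group(f, THRESHOLDS))
print(predicted)  # ['0004', '0005']
```

User 0003 spends the most but logs in only once a week, so the strict conjunction excludes that user; user 0001 fails because a login frequency of 2 is not greater than 2.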
Note that the second grouping model can be updated on the period of its short-term features, for example on the shortest period, that is, the period of whichever of its features has the shortest period, or it can be updated at any time, in real time, or in near real time, for example once every hour or two. For example, even when users spend more only because an event is being held (the event may be seasonal, one-off, temporary, unexpected, short-term, and so on), the short-term user features reflecting that event-driven spending can be treated as short-term user features of the second grouping model; that is, event-related short-term user features supplement the original ones. Because the second grouping model is updated continually, some time after the event ends it no longer uses those event-driven features; that is, the event-related short-term user features are removed from the set. Another approach is to prepare two second grouping models: one that ignores event-related short-term user features and is used under normal conditions, and another that takes them into account and is used while an event is running. For example, if the event promotes a certain category of goods, the model used during the event can take as its short-term user features the user's interests and the amount purchased in each week of the event (assuming the event lasts one month).
The sources and types of user data are not limited to those described above. Raw user data may also include data sensed by Internet of Things or other devices and transmitted to the prediction device. For example, when a user parks at a store, the parking payment machine records the user's parking time and number of visits, and the second grouping model can use the ratio of the user's weekly parking count to the number of in-store purchases as a short-term user feature. In addition, the user data of a single user may come from more than one source, for example user data obtained from the POS terminals of different branches of a physical chain store. As another example, a user who has been transferred no longer buys at the chain coffee shop near the original workplace and instead buys at the one near the new workplace; only after the user data from the two sources is consolidated can the user's short-term features be obtained correctly. This also illustrates the combined use of non-time-related features and short-term features.
Please refer next to FIG. 4, a schematic diagram of the predicted RFM grouping produced by the prediction method or device of an embodiment of the present utility model, which groups users into value user groups based on short-term user features, alongside the current RFM grouping. The predicted RFM grouping refers to predicting, by means of the short-term features described above, which customers fit each value customer group in the RFM model's grouping result. The high-value user group 411 in the current RFM grouping result 41, obtained by RFM grouping on the user data of the target time, and the high-value user group 421 in the RFM grouping result 42 predicted by the method or device of the embodiment may contain different users. The users of the predicted high-value user group 421 can be regarded as high-value or near-high-value users and marketed to directly. That is, the users of the high-value user group 421 can be added directly to the high-value user group 411 to refine the current RFM grouping result 41 and find as many potential high-value users as possible. In short, the prediction method of the embodiment further includes a step of merging the predicted value user group with the current value user group. Changes in membership between the high-value user groups 411 and 421 can also be used to judge whether the current marketing strategy is correct, as described below.
Please refer to FIG. 5A and FIG. 5B together. FIG. 5A is a schematic diagram of the value user group obtained at a first time by the prediction method or device of an embodiment of the present utility model, together with the current value user group; FIG. 5B is the corresponding diagram at a second time. At the first time, the predicted value user group 52 and the current value user group 51 share an intersection 531. At the second time, their intersection 532 is smaller than the intersection 531; that is, fewer users are in the intersection. Statistical analysis of the users in the intersection and of how the intersection changes can reveal that existing customers (users of the value user group 51 within the intersection 531) may no longer fit the value user group 51 or may have been lost. It is then possible to examine whether the marketing strategy for the value customer group 51 was mistaken or needs adjusting and thereby caused the loss of users, and so to decide whether the marketing strategy should be adjusted.
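Please refer to FIG. 5A and FIG. 5C together. FIG. 5C is another schematic diagram of the value user group obtained at the second time by the prediction method or device of an embodiment of the present utility model, together with the current value user group. At the first time, the predicted value user group 52 and the current value user group 51 share the intersection 531; at the second time, their intersection 533 has grown. This suggests that the marketing strategy may be taking effect, so the original strategy can be maintained. In some embodiments, the prediction device 15 may also include, or be electrically connected to, a display module (not shown), which renders the user groups 51 and 52 on a screen in graphical form similar to FIGS. 5A-5C, so that changes in the predicted value user group 52, or between the current value user group 51 and the predicted value user group 52, can be monitored. In short, the change over time in the intersection between the value user group predicted by the prediction device from short-term user features and the current value user group can be monitored to judge whether the marketing strategy is appropriate; that is, the prediction method of the embodiment further includes the step of monitoring this change in the intersection to judge whether the marketing strategy is correct.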
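Monitoring the intersection between the current and predicted groups at two points in time can be sketched with plain set operations. The user identifiers, the verdict strings, and the function name are hypothetical; the rule that a shrinking intersection prompts a review and a growing one supports keeping the strategy follows the description of FIGS. 5A to 5C.

```python
def intersection_change(current_t1, predicted_t1, current_t2, predicted_t2):
    """Compare the current/predicted overlap at two points in time."""
    first = current_t1 & predicted_t1
    second = current_t2 & predicted_t2
    if len(second) < len(first):
        verdict = "review strategy"
    elif len(second) > len(first):
        verdict = "keep strategy"
    else:
        verdict = "no change"
    # Users who were in the overlap at the first time but not at the second.
    return {"verdict": verdict, "left_intersection": first - second}


report = intersection_change(
    current_t1={"a", "b", "c"}, predicted_t1={"b", "c", "d"},
    current_t2={"a", "b", "c"}, predicted_t2={"c", "d", "e"},
)
print(report)
```

Here the overlap shrinks from two users to one, and user "b" is reported as having left it, which is the kind of user a follow-up analysis would examine.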
In some embodiments, the modules or methods of this disclosure can be carried out by a computer software program, which can be stored in a non-volatile storage medium such as a flash memory device, a magnetic storage device, an optical disc, or another type of storage medium. For example, the device of the present utility model can, by means of a computer software program, fit a first user data set with a machine learning algorithm to select a short-term user feature set. The short-term user feature set may contain at least one short-term feature. The first user data set may contain a long-term user feature set with a first periodicity, and the long-term user feature set may contain long-term feature data. The program can perform feature engineering on the user data set to select the short-term user feature set. The feature engineering may include feature importance analysis as the basis for the selection, and may include short-term feature extraction, which extracts the short-term user feature set from the long-term user feature set. The short-term user feature set may contain short-term user feature data with a second periodicity. The first periodicity may have a first period and the second periodicity a second period; for the same feature, the second period may be shorter than the first. The program can process the first user data set according to the short-term user feature set to produce a second user data set; this processing may include deleting from the first user data set the features outside the short-term user feature set. The program can fit the second user data set with the machine learning algorithm to produce a user grouping prediction model, and can use that model to make predictions on a third user data set, producing a first prediction result. The third user data set may contain the user data of at least one specific user and may contain the short-term user feature set. Besides the prediction of whether a specific user belongs to a specific group, the first prediction result may contain a fourth user data set, which may contain the user data predicted as positive examples by the user grouping prediction model. The program can perform a union analysis on the first and fourth user data sets to obtain a first union analysis result, and can merge the first and fourth user data sets to obtain a fifth user data set. After producing the first prediction result, the program can use the user grouping prediction model to make predictions on a sixth user data set, producing a second prediction result. The sixth user data set may contain the user data of at least one specific user and may contain the short-term user feature set. Besides the prediction of whether a specific user belongs to a specific group, the second prediction result may contain a seventh user data set, which may contain the user data predicted as positive examples by the model. The program can merge the first and seventh user data sets to obtain an eighth user data set, and can perform a union analysis on them to obtain a second union analysis result. The program can then compare the intersection of the first and second union analysis results to produce an intersection comparison result. This result can more faithfully reflect the short-term user features of a specific group, and the program can use it to build a classification model for judging or predicting whether a user belongs to that group.
In some embodiments, the present utility model includes a computer program product for user data processing, in which the computer software program can be stored in a non-volatile storage medium such as a flash memory device, a magnetic storage device, an optical disc, or another type of storage medium. Once loaded by a computer, the program predicts and expands a user data set by executing steps comprising: performing feature engineering on a first user data set so that it contains user data having a long-term user feature set, the long-term user feature set containing features with a first periodicity and the first user data set containing user data for a first number of users; fitting the first user data set with a machine learning algorithm to produce a short-term user feature set containing features with a second periodicity, where the first periodicity has a first period, the second periodicity has a second period, and the first period is longer than the second; performing feature selection on the first user data set according to the short-term user feature set to produce a second user data set containing user data having the short-term user features; fitting the second user data set with the machine learning algorithm to produce a user cluster prediction model; using the user cluster prediction model to make predictions on a third user data set to produce a first user cluster prediction result which, besides the prediction of whether a specific user belongs to a specific group, contains a fourth user data set; and merging the first user data set with the fourth user data set to produce a fifth user data set containing user data for a second number of users, the second number being greater than the first number.
In some embodiments, a user data set may contain the data of multiple different users, and a feature data set may contain multiple feature fields with different data attributes. Each feature field contains multiple feature values, which together form a feature value range. The computer software program can calculate or select a feature threshold from the feature value range of each feature field, and the feature data set may include these feature thresholds. A feature threshold may be the maximum, minimum, mean, median, mode, a percentile, a quartile, the standard deviation, or a similar statistic of the field's feature value range. Depending on the attributes of the feature values, the program can apply different rules to determine the threshold. For example, the program can assess the correlation between the feature values and the target variable or label fitted by the machine learning algorithm: if the correlation is positive, it may take the minimum of the feature value range as the feature threshold; if negative, the maximum. For example, if the target variable is spending amount, the feature is weekly login frequency, and the minimum of the feature value range is 2, the program can determine that spending amount and weekly login frequency are positively correlated and select that minimum as the feature threshold. In some examples, if the target variable and the feature are only weakly correlated, the program may take the arithmetic mean of the feature value range as the threshold. In some examples, if the feature is a count, the program may take the mode of the feature value range. In some examples, the program may set a threshold coefficient h = 0, 0.5, 1, 1.8, 2, ... and use the product of a statistic of the feature value range and the coefficient as the feature threshold; with h = 0.5, for instance, the threshold for the weekly login frequency above becomes 1. In some examples, the program may receive an instruction that sets the feature threshold and set it accordingly.
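The threshold rules in this passage can be sketched as one function. The sample values and the `weak` cut-off that separates weak from strong correlation are hypothetical; the choice of minimum for positive correlation, maximum for negative correlation, arithmetic mean for weak correlation, and multiplication by the coefficient h follows the description.

```python
import statistics


def feature_threshold(values, correlation, h=1.0, weak=0.2):
    """Pick a statistic of the value range by correlation sign, then scale by h."""
    if abs(correlation) < weak:
        base = statistics.mean(values)  # weak correlation: arithmetic mean
    elif correlation > 0:
        base = min(values)              # positive correlation: minimum
    else:
        base = max(values)              # negative correlation: maximum
    return h * base


# Hypothetical weekly login frequencies in one group; the minimum is 2.
weekly_logins = [2, 3, 5, 10]

print(feature_threshold(weekly_logins, correlation=0.8))         # 2.0
print(feature_threshold(weekly_logins, correlation=0.8, h=0.5))  # 1.0
```

With h = 0.5 the threshold for weekly login frequency drops from 2 to 1, as in the example; a smaller coefficient loosens the rule and admits more users.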
The effect of the present utility model is that a user value score can be predicted (that is, whether a user belongs to a given valuable user group) from the user's short-term characteristics, such as daily or single-purchase consumption behavior. Prediction can therefore be targeted at individual users, so that users' reactions to current services are captured more immediately and comprehensively, without waiting for final consumption results. Further, the present utility model makes it possible to retain high-value users effectively and to convert quasi-high-value users into high-value users early. It can even compensate for the shortcomings of existing customer segmentation, such as RFM segmentation, by treating quasi-high-value users as high-value users (that is, merging the predicted high-value user group with the high-value user group of the current segmentation), thereby increasing the amount and success rate of transactions and subscriptions.
The above description of "the present utility model", the descriptions in the embodiments and modifications, and the contents disclosed in the drawings are only examples for explaining the utility model recited in the claims. Various combinations of implementations with different features or aspects are also possible within the spirit of the present utility model, so the utility model recited in the claims is not limited by the above description, the embodiments, or the contents disclosed in the drawings. The claims as originally filed in this application are merely an example, and the recitations in the claims may be changed as appropriate according to the description, the drawings, and the like.
1: Valuable user grouping system
11: Platform device
111: Browsing behavior
112: Interests and preferences
113: Device information
114: Internet Protocol (IP) address or device identifier
12: User management system
121: Demographic data
122: Consumption records
123: Billing records
124: Interaction records
13: Public data source device
131: Climate and geography
132: Finance and economy
133: Culture and population
134: Medicine and biology
14: Database
15: Prediction device
151: Data processing module
152: Machine learning module
153: Model deployment module
154: Marketing module
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| TW110210571U TWM624658U (en) | 2021-09-07 | 2021-09-07 | Prediction devices for predicting whether users belong to valuable user groups based on short-term user characteristics | 
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| TW110210571U TWM624658U (en) | 2021-09-07 | 2021-09-07 | Prediction devices for predicting whether users belong to valuable user groups based on short-term user characteristics | 
Publications (1)
| Publication Number | Publication Date | 
|---|---|
| TWM624658U (en) | 2022-03-21 | 
Family
ID=81748015
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date | 
|---|---|---|---|
| TW110210571U TWM624658U (en) | 2021-09-07 | 2021-09-07 | Prediction devices for predicting whether users belong to valuable user groups based on short-term user characteristics | 
Country Status (1)
| Country | Link | 
|---|---|
| TW (1) | TWM624658U (en) | 
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title | 
|---|---|---|---|---|
| CN115408436A (en) * | 2022-08-05 | 2022-11-29 | 中国民生银行股份有限公司 | Method and device for screening financial users | 
| TWI787127B (en) * | 2022-05-12 | 2022-12-11 | 智泓科技股份有限公司 | Marketing object decision-making method and system and computer program product | 
| TWI831462B (en) * | 2022-11-09 | 2024-02-01 | 元智大學 | Customer consumption behavior prediction system and customer consumption behavior prediction method | 
| TWI836867B (en) * | 2023-01-17 | 2024-03-21 | 第一商業銀行股份有限公司 | Customer financing methods and systems for fixed-term financial products | 
- 2021-09-07 TW TW110210571U (published as TWM624658U), status unknown
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11893593B2 (en) | | Sales prediction systems and methods |
| Rossi et al. | | The value of purchase history data in target marketing |
| Jain et al. | | Investigating household purchase timing decisions: A conditional hazard function approach |
| US20220036391A1 (en) | | Auto-segmentation |
| TWM624658U (en) | | Prediction devices for predicting whether users belong to valuable user groups based on short-term user characteristics |
| US20190019213A1 (en) | | Predicting the effectiveness of a marketing campaign prior to deployment |
| US20140279208A1 (en) | | Electronic shopping system and service |
| US20150006286A1 (en) | | Targeting users based on categorical content interactions |
| US20150006294A1 (en) | | Targeting rules based on previous recommendations |
| EP2474945A1 (en) | | Analyzing transactional data |
| WO2011005072A2 (en) | | Personalized shopping list recommendation based on shopping behavior |
| US20130046619A1 (en) | | System and method for targeted advertising |
| US20130204823A1 (en) | | Tools and methods for determining relationship values |
| CN113781139A (en) | | Item recommendation method, item recommendation device, equipment and medium |
| KR20230142975A (en) | | Online and offline integrated customer behavior-based personalized marketing system |
| Zheng et al. | | A scalable purchase intention prediction system using extreme gradient boosting machines with browsing content entropy |
| Keramati et al. | | Investigating factors affecting customer churn in electronic banking and developing solutions for retention |
| CN119027165A (en) | | Commodity marketing method and system based on artificial intelligence |
| Kunekar et al. | | Enhancing advertising initiatives: using machine learning algorithms to engage targeted customer |
| WO2001033410A2 (en) | | Segment-based self-learning method and system |
| TW202312060A (en) | | Prediction devices and methods for predicting whether users belong to valuable user groups based on short-term user characteristics, and storage media for storing the methods |
| Zhang et al. | | Modeling customers' loyalty using ten years' automobile repair and maintenance data: Machine learning approaches |
| CN115760347A (en) | | Bank product management system |
| TW202312061A (en) | | User management devices and methods capable of identifying anonymous users belonging to specific user groups, and storing media storing the methods |
| Nimma et al. | | Optimizing Mobile Advertising with Reinforcement Learning and Deep Neural Networks |