{"created":"2021-03-01T06:19:55.492180+00:00","id":12932,"links":{},"metadata":{"_buckets":{"deposit":"7fb3fcfa-b973-42e6-83b6-70bcc0fa6c11"},"_deposit":{"id":"12932","owners":[],"pid":{"revision_id":0,"type":"depid","value":"12932"},"status":"published"},"_oai":{"id":"oai:nagoya.repo.nii.ac.jp:00012932","sets":["312:651:734"]},"author_link":["40782","40783"],"item_12_biblio_info_6":{"attribute_name":"書誌情報","attribute_value_mlt":[{"bibliographicIssueDates":{"bibliographicIssueDate":"2011-03-25","bibliographicIssueDateType":"Issued"}}]},"item_12_date_granted_64":{"attribute_name":"学位授与年月日","attribute_value_mlt":[{"subitem_dategranted":"2011-03-25"}]},"item_12_degree_grantor_62":{"attribute_name":"学位授与機関","attribute_value_mlt":[{"subitem_degreegrantor":[{"subitem_degreegrantor_language":"ja","subitem_degreegrantor_name":"名古屋大学"},{"subitem_degreegrantor_language":"en","subitem_degreegrantor_name":"Nagoya University"}],"subitem_degreegrantor_identifier":[{"subitem_degreegrantor_identifier_name":"13901","subitem_degreegrantor_identifier_scheme":"kakenhi"}]}]},"item_12_degree_name_61":{"attribute_name":"学位名","attribute_value_mlt":[{"subitem_degreename":"博士(情報科学)","subitem_degreename_language":"ja"}]},"item_12_description_4":{"attribute_name":"抄録","attribute_value_mlt":[{"subitem_description":"Data stream mining techniques have recently become increasingly important in many applications, for example intrusion detection on network flow data, outlier detection in sensor network data, and usage analysis of telecommunication data. In these applications, data stream mining algorithms are used to discover up-to-date patterns or associations hidden inside the continuous data. Besides research on single data sources, in the era of information overload it is also meaningful to mine correlations among cross-domain data sources in order to support people’s decision making. 
For example, automatic analysis of news articles concerning the financial market helps to generate profitable action signals (buy or sell stocks) accurately. In this dissertation, we aim to discover interesting correlations among multiple evolving data streams. First, in terms of the different streaming data sources, we categorize the correlations in streaming data into two basic types: discrete correlation and continuous correlation. Discrete correlation corresponds to applications that assume the data samples are independent of each other. For example, in market basket data, we assume that the records of customers’ purchases are independent, so the correlations among attributes (products) are discrete. Existing frequent itemset mining techniques regard frequently co-occurring sets of attributes as highly correlated attributes. In some cases, however, this approach may miss important patterns. For example, we may be interested in the knowledge of “what are the symptoms of the new and rare illness?” from medical records data, even though the illness occurs rarely. Correlation-based association rule mining provides a good solution to this kind of problem. However, although algorithms for mining correlated patterns in static datasets exist, no work has been done to accomplish the same task for data streams. On the other hand, in some applications (e.g., sensor networks and stock markets), the data samples in the whole collection of time series are correlated with each other along the time axis. In this case, we define the cross-relationship among multiple continuous time-series data streams as continuous correlation. Existing research has calculated the correlations between streams and reported highly correlated pairs of streams. 
However, none of these algorithms manages to compactly and adaptively describe the key trends among the whole collection of streaming time series, although streams are often inherently correlated. The key trends are useful for reducing the massive numerical streams to just a handful of variables. In addition, several works have applied clustering techniques to multiple data streams to discover cross-relationships. However, because data streams evolve, another important issue arises: supporting various clustering requirements at the same time, rather than checking cluster evolution periodically as existing works do. Consequently, this dissertation extends the study to mine complex correlations in cross-domain data sources by combining the investigations of these two kinds of basic correlations. Hence, in this dissertation, we further explore correlation discovery in different applications of data streams. The key research challenges addressed in this dissertation include: (I) correlated pattern mining in streaming transaction data; (II) adaptive and flexible correlation mining among massive continuous time series; and (III) cross-domain correlation analysis among multiple sources of data streams. This dissertation makes a number of contributions toward solving these tasks, including the following algorithms: • Quantifiable Correlated Patterns Mining: This method mines correlated attributes from streaming transaction data. Additionally, for applications with quantitative data, we also discover frequent ratio associations among the highly correlated attributes. To the best of our knowledge, this work is the first study to mine both correlations and ratio associations in streaming quantitative transaction data with only a single scan of the data and limited memory. 
• Correlated-Clusters Mining: This algorithm reduces massive evolving streaming time-series data to just a handful of hidden variables, which summarize the key trends of the massive evolving time-series data streams automatically, incrementally, and adaptively. We prove that the discovered hidden variables can be used to detect concept drifts immediately and to perform efficient forecasting in sensor networks. • Flexible Timeline Clustering: A framework is proposed to support various clustering requirements at any time over the whole collection of streaming time-series data. In a clustering request, the user specifies arbitrary time periods of interest. An incremental time-series approximation method and hierarchical structures for maintaining statistics are proposed to satisfy the demand for efficient retrieval with high accuracy. • Dynamic Prediction of Stock Prices Based on Analysis of News Articles: As an example of correlation analysis among cross-domain data sources, we realize automatic analysis of the correlation between online news articles and stock prices. This work classifies news articles into good news, which is followed by an upward trend in the company’s stock price, and bad news, which is followed by a downward trend. In this problem, classification is the process of mining what we defined as discrete correlation, because we treat the collection of news articles as transaction data consisting of words, and the news articles are independent of each other. To improve the accuracy of prediction, we also take continuous correlations into account in this problem. 
On the one hand, when generating news articles for learning, we abstract trends of stock prices and then label the news articles according to the corresponding trends; on the other hand, we propose a dynamic mechanism for choosing sliding windows to identify trends of stock prices according to the contents of consecutive news articles, taking into account that the stock market may react sensitively to significant topics in consecutive news articles. Extensive experiments on both synthetic and real-life data demonstrate that our work is effective and practical. Furthermore, as a trial investigation of correlations between news articles and stock prices, the proposed correlation mining techniques can serve as the basis of another intelligent data analysis goal, information integration.","subitem_description_language":"en","subitem_description_type":"Abstract"}]},"item_12_description_5":{"attribute_name":"内容記述","attribute_value_mlt":[{"subitem_description":"名古屋大学博士学位論文 学位の種類 : 博士(情報科学)(課程) 学位授与年月日:平成23年3月25日","subitem_description_language":"ja","subitem_description_type":"Other"}]},"item_12_dissertation_number_65":{"attribute_name":"学位授与番号","attribute_value_mlt":[{"subitem_dissertationnumber":"甲第9273号"}]},"item_12_identifier_60":{"attribute_name":"URI","attribute_value_mlt":[{"subitem_identifier_type":"HDL","subitem_identifier_uri":"http://hdl.handle.net/2237/14822"}]},"item_12_select_15":{"attribute_name":"著者版フラグ","attribute_value_mlt":[{"subitem_select_item":"publisher"}]},"item_12_text_63":{"attribute_name":"学位授与年度","attribute_value_mlt":[{"subitem_text_value":"2010"}]},"item_access_right":{"attribute_name":"アクセス権","attribute_value_mlt":[{"subitem_access_right":"open access","subitem_access_right_uri":"http://purl.org/coar/access_right/c_abf2"}]},"item_creator":{"attribute_name":"著者","attribute_type":"creator","attribute_value_mlt":[{"creatorNames":[{"creatorName":"FAN, 
Wei","creatorNameLang":"en"}],"nameIdentifiers":[{"nameIdentifier":"40782","nameIdentifierScheme":"WEKO"}]},{"creatorNames":[{"creatorName":"范, 薇","creatorNameLang":"ja"}],"nameIdentifiers":[{"nameIdentifier":"40783","nameIdentifierScheme":"WEKO"}]}]},"item_files":{"attribute_name":"ファイル情報","attribute_type":"file","attribute_value_mlt":[{"accessrole":"open_date","date":[{"dateType":"Available","dateValue":"2018-02-20"}],"displaytype":"detail","filename":"k9273.pdf","filesize":[{"value":"2.2 MB"}],"format":"application/pdf","licensetype":"license_note","mimetype":"application/pdf","url":{"label":"k9273.pdf","objectType":"fulltext","url":"https://nagoya.repo.nii.ac.jp/record/12932/files/k9273.pdf"},"version_id":"0adf2e04-1c6a-45db-9576-f756df5cd6d0"}]},"item_language":{"attribute_name":"言語","attribute_value_mlt":[{"subitem_language":"eng"}]},"item_resource_type":{"attribute_name":"資源タイプ","attribute_value_mlt":[{"resourcetype":"doctoral thesis","resourceuri":"http://purl.org/coar/resource_type/c_db06"}]},"item_title":"A Study on Knowledge Discovery among Multiple Evolving Data Streams","item_titles":{"attribute_name":"タイトル","attribute_value_mlt":[{"subitem_title":"A Study on Knowledge Discovery among Multiple Evolving Data Streams","subitem_title_language":"en"}]},"item_type_id":"12","owner":"1","path":["734"],"pubdate":{"attribute_name":"PubDate","attribute_value":"2011-05-19"},"publish_date":"2011-05-19","publish_status":"0","recid":"12932","relation_version_is_last":true,"title":["A Study on Knowledge Discovery among Multiple Evolving Data Streams"],"weko_creator_id":"1","weko_shared_id":-1},"updated":"2023-01-16T05:00:22.977025+00:00"}