Advances in Data Mining and Modeling: Hong Kong 27 - 28 June 2002 by Wai-Ki Ching, Michael Kwok-Po Ng

By Wai-Ki Ching, Michael Kwok-Po Ng

Data mining and data modelling are under rapid development. Because of their wide applications and research contents, many practitioners and academics are attracted to work in these areas. With a view to promoting communication and collaboration among the practitioners and researchers in Hong Kong, a workshop on data mining and modelling was held in June 2002. Prof Ngaiming Mok, Director of the Institute of Mathematical Research, The University of Hong Kong, and Prof Tze Leung Lai (Stanford University), C.V. Starr Professor of the University of Hong Kong, initiated the workshop. This work contains selected papers presented at the workshop. The papers fall into two main categories: data mining and data modelling. Data mining papers deal with pattern discovery, clustering algorithms, classification and practical applications in the stock market. Data modelling papers treat neural network models, time series models, statistical models and practical applications.

Additional info for Advances in Data Mining and Modeling: Hong Kong 27 - 28 June 2002

Sample text

In the final DCC model, we often drop certain clusters from a clustering. For example, some leaves in Figure 2 do not have class symbols. These clusters contain few objects, spread over several classes; they are the objects located on the boundaries of other clusters. From our experiments, we found that dropping these clusters from the model can increase the classification accuracy. Table 1 shows four public data sets taken from the UCI machine learning data repository, which were used to test the DCC models against some other classifiers.

A member of the k-means family) and F the Fastmap algorithm. We summarize the interactive process to build a cluster tree in Figure 3. After we build a cluster tree from the training data set using this process, we have created a sequence of clusterings. In principle, each clustering is a DCC model, but their classification performance differs. Therefore, we use a test data set (or tuning data set) to identify the best DCC model in a cluster tree. We start from a top-level clustering.
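The selection step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: each level of the cluster tree is represented as a flat list of cluster ids (one per object), each cluster is labelled with the majority class of its training objects, and the helper names `label_clusters`, `accuracy` and `best_level` are hypothetical. How tuning objects get assigned to clusters is not covered by the excerpt, so the assignments are taken as given.

```python
from collections import Counter


def label_clusters(assignments, labels):
    """Map each cluster id to the majority class of its training objects."""
    counts_by_cluster = {}
    for cluster_id, label in zip(assignments, labels):
        counts_by_cluster.setdefault(cluster_id, Counter())[label] += 1
    return {
        cluster_id: counts.most_common(1)[0][0]
        for cluster_id, counts in counts_by_cluster.items()
    }


def accuracy(cluster_to_class, assignments, labels):
    """Fraction of objects whose cluster's class matches their true class."""
    hits = sum(
        cluster_to_class.get(cluster_id) == label
        for cluster_id, label in zip(assignments, labels)
    )
    return hits / len(labels)


def best_level(train_levels, train_labels, tune_levels, tune_labels):
    """Return (level index, tuning accuracy) of the best clustering.

    Assumption: train_levels[i] and tune_levels[i] hold the cluster ids of
    the training and tuning objects at level i of the cluster tree.
    """
    best = None
    for level, (train_assign, tune_assign) in enumerate(
        zip(train_levels, tune_levels)
    ):
        model = label_clusters(train_assign, train_labels)
        acc = accuracy(model, tune_assign, tune_labels)
        if best is None or acc > best[1]:
            best = (level, acc)
    return best
```

Levels are scanned from the top of the tree downwards, and the first level with the highest tuning accuracy is kept.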

Selection of k sets of most effective features from a group of N features. This is a typical combinatorial optimization problem. The total number of combinations is $C_N^k$. The computational work is gigantic, and it is unrealistic to compare all possible combinations and select the optimized feature sets. In fact, it is impossible because there are thousands upon thousands of genes. The simplest way is to score each gene based on a certain criterion, filtering away genes with the lowest scores and selecting genes with the highest scores by various means, such as the individual gene-based scoring method [6, 8], the mutual information scoring method [9], the Markov blanket filtering method [10], etc.

* The work is supported by the National Natural Science Foundation of China (60175004).
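To make the contrast concrete, here is a minimal sketch of the two ideas in the passage: counting the candidate subsets, and ranking genes by a per-gene score. The scoring criterion used here (absolute difference of class means) is a placeholder chosen for illustration; it is not one of the cited scoring methods, and all function names are hypothetical.

```python
import math


def n_subsets(n_features, k):
    """Number of ways to choose k features out of n_features."""
    return math.comb(n_features, k)


def mean_difference_score(values_a, values_b):
    """Placeholder criterion: absolute difference of the two class means."""
    mean_a = sum(values_a) / len(values_a)
    mean_b = sum(values_b) / len(values_b)
    return abs(mean_a - mean_b)


def top_k_by_score(scores, k):
    """Keep the k genes with the highest scores, filtering away the rest."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]
```

With a few thousand genes, `n_subsets` grows far beyond what exhaustive comparison can handle, whereas per-gene scoring needs only one pass over the genes followed by a sort.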
