Call for Chapters: Intelligent Multidimensional Data Clustering and Analysis

Editors
Dr. Siddhartha Bhattacharyya, RCC Institute of Information Technology, India
Dr. Sourav De, University Institute of Technology, India
Dr. Indrajit Pan, RCC Institute of Information Technology, India
Prof. (Dr.) Paramartha Dutta, Visva-Bharati University, India

Call for Chapters

Please click http://www.igi-global.com/publish/call-for-papers/call-details/1849 to submit a chapter proposal.

Proposals Submission Deadline: November 30, 2015
Full Chapters Due: January 30, 2016

Introduction
Commonly used as a preliminary data mining practice, data preprocessing transforms data into a format that can be processed more easily and effectively for the user's purposes. There are a number of data preprocessing techniques: data cleaning, data integration, data transformation and data reduction. The need to cluster large quantities of multidimensional data is widely recognized. Cluster analysis is used to identify homogeneous and well-separated groups of objects in databases. It plays an important role in many fields of business and science.
Existing clustering algorithms can be broadly classified into four types: partitioning, hierarchical, grid-based, and density-based algorithms. Partitioning algorithms start with an initial partition and then use an iterative control strategy to optimize the quality of the clustering results by moving objects from one group to another.
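As an illustration of the partitioning idea described above, the following is a minimal sketch of k-means, one well-known partitioning method. It is not taken from the call itself: the function name kmeans, the choice of the first k points as the initial centers, and the sample points are all assumptions made for the example.

```python
def kmeans(points, k, max_iterations=100):
    """Partition `points` (tuples of floats) into k groups by iterative relocation."""
    # Assumption for this sketch: the first k points serve as the initial centers.
    centers = list(points[:k])
    labels = None
    for _ in range(max_iterations):
        # Relocation: assign every object to the group with the nearest center.
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # no object moved, so the partition is stable
        labels = new_labels
        # Update: recompute each center as the mean of its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty group keeps its previous center
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centers


# Hypothetical sample data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The initial partition here is poor (both starting centers lie in the same group), and the relocation step repairs it within a few iterations. On less clearly separated data the result can depend strongly on the initial partition.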
Hierarchical algorithms create a hierarchical decomposition of the given set of data objects. Grid-based algorithms quantize the space into a finite number of cells and perform all operations on this quantized space.
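The quantization step of a grid-based method can be sketched as follows. This is only an illustration under stated assumptions: the helper names grid_cell_counts and dense_cells, the uniform cell size, and the min_count threshold are hypothetical and do not correspond to any specific published algorithm.

```python
import math
from collections import Counter


def grid_cell_counts(points, cell_size):
    """Quantize the space into axis-aligned cells and count the points in each cell."""
    return Counter(tuple(math.floor(x / cell_size) for x in p) for p in points)


def dense_cells(points, cell_size, min_count):
    """Return the cells that hold at least `min_count` points."""
    counts = grid_cell_counts(points, cell_size)
    return {cell for cell, n in counts.items() if n >= min_count}


# Hypothetical data: four points gathered tightly around the grid vertex (1, 1).
points = [(0.9, 0.9), (1.1, 0.9), (0.9, 1.1), (1.1, 1.1)]
print(dense_cells(points, cell_size=1.0, min_count=2))  # set()
print(dense_cells(points, cell_size=2.0, min_count=2))  # {(0, 0)}
```

With a cell size of 1.0 the four points fall into four different cells and no cell looks dense, while a cell size of 2.0 keeps them together. The outcome therefore depends on how the grid happens to align with the data.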
Density-based approaches are designed to discover clusters of arbitrary shapes. These approaches hold that, for each point within a cluster, the number of points in the neighborhood of a given radius must exceed a defined threshold. Each of the existing clustering algorithms has both advantages and disadvantages. The most common problem is rapid degradation of performance with increasing dimensions, particularly with approaches originally designed for low-dimensional data. To solve the high-dimensional clustering problem, dimension reduction methods have been proposed which assume that clusters are located in a low-dimensional subspace. However, this assumption does not hold for many real-world data sets. The difficulty of high-dimensional clustering is primarily due to the following characteristics of high-dimensional data:
    High-dimensional data often contain a large amount of noise (outliers). The existence of noise results in clusters which are not well-separated and degrades the effectiveness of the clustering algorithms.
    Clusters in high-dimensional spaces are commonly of various densities. Grid-based or density-based algorithms therefore have difficulty choosing a proper cell size or neighborhood radius which can find all clusters.
    Clusters in high-dimensional spaces rarely have well-defined shapes, and some algorithms assume clusters of certain shapes.
    The effectiveness of grid-based approaches suffers when data points are clustered around a vertex of the grid and are separated into different cells.
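The difficulty of choosing a single neighborhood radius for clusters of different densities can be illustrated with a small sketch of the density test alone. It is not a complete clustering algorithm. The names neighbors and core_points, the min_points parameter, and the sample data are assumptions made for this example, and counting a point as its own neighbor is one common convention among several.

```python
import math


def neighbors(points, i, radius):
    """Indices of all points within `radius` of points[i], the point itself included."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= radius]


def core_points(points, radius, min_points):
    """Indices of points whose neighborhood holds at least `min_points` points."""
    return [
        i for i in range(len(points))
        if len(neighbors(points, i, radius)) >= min_points
    ]


# Hypothetical data: a dense group, a sparse group, and one noise point.
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
sparse = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)]
noise = [(1.0, 0.0)]
points = dense + sparse + noise

print(core_points(points, radius=0.2, min_points=3))  # [0, 1, 2, 3]
print(core_points(points, radius=1.5, min_points=3))  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

The small radius recognizes only the dense group and misses the sparse one. The large radius recovers the sparse group but also treats the noise point as part of a dense region.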
To sum up, the classical techniques fall short in one way or another in the faithful analysis and clustering of multidimensional data, owing to inherent uncertainties in their assumptions and heuristic choices. It is in this scenario that the soft computing paradigm can be used to arrive at effective and productive outcomes.

Objective
The objective is to bring a broad spectrum of multidimensional data clustering and data analysis applications under the purview of hybrid intelligence, so as to inspire research communities to contribute in their respective fields of application and thereby orient those fields towards intelligence. Once this purpose is achieved, a larger number of research communities may be brought under one umbrella to air their ideas in a more structured manner. The present endeavor may then be seen as the beginning of an effort to bring various research applications closer to one another. By coming closer academically, research communities working in diverse application areas involving multidimensional data (viz. true color images, videos, and big data) would be more encouraged to form groups among themselves, paving the way for interdisciplinary research. From a scholastic point of view this would be a formidable achievement, of which the present endeavor may be thought of as the maiden facilitator. There are, admittedly, a good number of contributions on the application of hybrid soft computing in various fields. However, such previous efforts have remained application specific, i.e. aimed at a specific application domain where the ingredients of hybrid soft computing have been applied quite effectively. To the best of our knowledge, efforts to bring multiple domains of multidimensional data within one framework are infrequent. In that sense, this appears to be the first such effort to accommodate cross-platform applications of hybrid soft computing. Moreover, efforts at hybridization are meager in the literature. If successful, this book will encourage further interdisciplinary research by giving various research communities scope to come together.

Target Audience
The proposed book would benefit several categories of students and researchers. At the student level, the book can serve as a treatise or reference book for special papers at the master's level, aimed at inspiring possible future researchers. Newly inducted PhD aspirants would also find its contents useful for their compulsory coursework. At the researcher level, those interested in interdisciplinary research would also benefit from the book. Its interdisciplinary contents should be of interest to faculty, existing research communities, and new research aspirants from diverse disciplines in the concerned departments of premier institutes across the globe. Owing to its cross-platform character, the book is expected to bring different research backgrounds close to one another and to help form effective research groups all over the world. Above all, the book should be made available to as many universities and research institutes as possible, through whatever graceful means are available.

Recommended Topics
PART - I: Theoretical Foundation of Hybrid Intelligence:

    Computational Intelligence: foundations and principles; neural networks; fuzzy systems; near set; soft set; evolutionary computation; rough sets; swarm intelligence
    Hybridization of intelligent techniques

PART - II: Hybrid Soft Computing Paradigm:

    Algorithmic, experimental, prototyping and implementation
    Neuro-fuzzy, Neuro-genetic, Fuzzy-genetic, Neuro-fuzzy-genetic architectures etc.
    Rough-fuzzy, Rough-neuro, Rough-neuro-fuzzy architectures and the like
    Quantum inspired hybrid soft computing architectures

PART - III: Introduction to Multidimensional Data

    Types of Data Sets, Images, Videos, Big Data, Homogeneous and Heterogeneous data
    Characteristics of Data Sets
    Common Errors in Data Sets, Missing Values
    Outliers
    Data Correlation
    Standard Data Sets

PART - IV: Data Preprocessing

    Expert Knowledge
    Outlier Removal
    Noise Removal
    Handling of missing values
    Distance Metrics
    Data Normalization

PART - V: Characterizing the uncertainty

    Types of uncertainty, methods for handling the uncertainty
    Bootstrap confidence intervals of the phenomenon
    Permutation tests
    Randomization

PART - VI: Data Handling Methods

    Dimensionality Reduction
    Singular Value Decomposition
    Noise Removal
    Principal Component Analysis
    Independent Component Analysis
    Feature Selection
    Segmentation

PART - VII: Data Clustering

    K-means clustering
    k-nearest neighbor clustering
    Fuzzy c-means clustering
    Hierarchical clustering
    Support Vector Machines
    Decision Trees
    Automatic Data Clustering Algorithms

PART - VIII: Data Analysis

    Association Rules
    Multilevel Image Segmentation
    Video Segmentation
    Rough-Fuzzy Data Analysis

PART - IX: Data Mining

    Models - Supervised, Unsupervised
    Supervised - Classification, Regression
    Unsupervised - Clustering, Latent variable models
    Cross validation

Submission Procedure
Researchers and practitioners are invited to submit, on or before November 30, 2015, a chapter proposal of 1,000 to 2,000 words clearly explaining the mission and concerns of their proposed chapter. Full chapters are expected to be submitted by January 30, 2016. All submitted chapters will be reviewed on a double-blind review basis. Contributors may also be requested to serve as reviewers for this project.

Publisher
This book is scheduled to be published by IGI Global (formerly Idea Group Inc.), publisher of the "Information Science Reference" (formerly Idea Group Reference), "Medical Information Science Reference," "Business Science Reference," and "Engineering Science Reference" imprints. For additional information regarding the publisher, please visit www.igi-global.com. This publication is anticipated to be released in 2017.

Important Dates

November 30, 2015: Proposal Submission Deadline
January 30, 2016: Full Chapter Submission
February 28, 2016: Revised Chapter Submission

Inquiries
Dr. Siddhartha Bhattacharyya
Department of Information Technology
RCC INSTITUTE OF INFORMATION TECHNOLOGY
CANAL SOUTH ROAD, BELIAGHATA, KOLKATA – 700 015, INDIA
M: +919830354195