Biological Knowledge Discovery Handbook: Preprocessing, Mining, and Postprocessing of Biological Data, 1st Edition, by Mourad Elloumi, Albert Zomaya, Yi Pan

Product details:
ISBN 10: 1118132734
ISBN 13: 978-1118132739
Authors: Mourad Elloumi, Albert Zomaya, Yi Pan
The first comprehensive overview of preprocessing, mining, and postprocessing of biological data
Molecular biology is undergoing exponential growth in both the volume and complexity of biological data, and knowledge discovery offers the capacity to automate complex search and data analysis tasks. This book presents a vast overview of the most recent developments on techniques and approaches in the field of biological knowledge discovery and data mining (KDD), providing in-depth fundamental and technical field information on the most important topics encountered.
Written by top experts, Biological Knowledge Discovery Handbook: Preprocessing, Mining, and Postprocessing of Biological Data covers the three main phases of knowledge discovery (data preprocessing; data processing, also known as data mining; and data postprocessing) and analyzes both verification systems and discovery systems.
Table of contents:
Section I: Biological Data Preprocessing
Part A: Biological Data Management
Chapter 1: Genome and Transcriptome Sequence Databases for Discovery, Storage, and Representation of Alternative Splicing Events
1.1 Introduction
1.2 Splicing
1.3 Alternative Splicing
1.4 Alternative Splicing Databases
1.5 Data Mining from Alternative Splicing Databases
Acknowledgments
Web Resources
References
Chapter 2: Cleaning, Integrating, and Warehousing Genomic Data from Biomedical Resources
2.1 Introduction
2.2 Related Work
2.3 Typology of Data Quality Problems in Biomedical Resources
2.4 Cleaning, Integrating, and Warehousing Biomedical Data
2.5 Conclusions and Perspectives
Web Resources
References
Chapter 3: Cleansing of Mass Spectrometry Data for Protein Identification and Quantification
3.1 Introduction
3.2 Preprocessing Approach for Improving Protein Identification
3.3 Identification Filtering Approach for Improving Protein Identification
3.4 Evaluation Results
3.5 Conclusion
References
Chapter 4: Filtering Protein–Protein Interactions by Integration of Ontology Data
4.1 Introduction
4.2 Evaluation of Semantic Similarity
4.3 Identification of False Protein–Protein Interaction Data
4.4 Conclusion
References
Part B: Biological Data Modeling
Chapter 5: Complexity and Symmetries in DNA Sequences
5.1 Introduction
5.2 Archaea
5.3 Patterns on Indicator Matrix
5.4 Measure of Complexity and Information
5.5 Complex Root Representation of DNA Words
5.6 DNA Walks
5.7 Wavelet Analysis
5.8 Algorithm of Short Haar Discrete Wavelet Transform
5.9 Conclusions
References
Chapter 6: Ontology-Driven Formal Conceptual Data Modeling for Biological Data Analysis
6.1 Introduction
6.2 Description Logics for Conceptual Data Modeling
6.3 Extensions
6.4 Automated Reasoning and Biological Knowledge Discovery
6.5 Conclusions and Outlook
References
Chapter 7: Biological Data Integration Using Network Models
7.1 Introduction
7.2 Biological Network Models
7.3 Network Models in Understanding Disease
7.4 Future Challenges
Acknowledgment
References
Chapter 8: Network Modeling of Statistical Epistasis
8.1 Introduction
8.2 Epistasis and Detection
8.3 Network
8.4 Gene-Association Interaction Network
8.5 Statistical Epistasis Networks
8.6 Concluding Remarks
Acknowledgment
References
Chapter 9: Graphical Models for Protein Function and Structure Prediction
9.1 Introduction
9.2 Graphical Models
9.3 Applications
9.4 Summary
Acknowledgments
References
Part C: Biological Feature Extraction
Chapter 10: Algorithms and Data Structures for Next-Generation Sequences
10.1 Aligners
10.2 Assemblers
References
Chapter 11: Algorithms for Next-Generation Sequencing Data
11.1 Introduction
11.2 Definitions and Notations
11.3 REAL: A Read Aligner for Mapping Short Reads to a Genome
11.4 CREAL: Mapping Short Reads to a Genome with Circular Structure
11.5 DynMap: Mapping Short Reads to Multiple Closely Related Genomes
11.6 Conclusion
References
Chapter 12: Gene Regulatory Network Identification with Qualitative Probabilistic Networks
12.1 Central Dogma: Gene Expression in a Cell
12.2 Measuring Expression Levels: Microarray Technology
12.3 Understanding Gene Regulatory Networks: Basic Concepts
12.4 Bayesian Networks for Learning GRNs
12.5 Toward Qualitative Modeling of GRNs
12.6 QPNs for Gene Regulation
12.7 Summary and Conclusions
References
Part D: Biological Feature Selection
Chapter 13: Comparing, Ranking, and Filtering Motifs with Character Classes: Application to Biological Sequences Analysis
13.1 Introduction
13.2 Motifs with Character Classes: A Characterization
13.3 Filtering by Means of Underlying Motifs
13.4 Experimental Results and Discussion
13.5 Conclusion
Acknowledgments
References
Chapter 14: Stability of Feature Selection Algorithms and Ensemble Feature Selection Methods in Bioinformatics
14.1 Introduction
14.2 Feature Selection Algorithms and Instability
14.3 Ensemble Feature Selection Algorithms
14.4 Metrics for Stability Assessment
14.5 Conclusions
Acknowledgment
References
Chapter 15: Statistical Significance Assessment for Biological Feature Selection: Methods and Issues
15.1 Introduction
15.2 Statistical Significance Assessment
15.3 p-Value Distribution and π0 Estimation
15.4 Obtaining Control and Background Estimation
15.5 Statistical Significance in Integrative Analysis
15.6 Conclusions
Symbols
Acknowledgments
References
Chapter 16: Survey of Novel Feature Selection Methods for Cancer Classification
16.1 Biological Background
16.2 Introduction
16.3 Kernel-Based Feature Selection with Hilbert–Schmidt Independence Criterion
16.4 Redundancy-Based Gene Selection
16.5 Unsupervised Feature Selection
16.6 Summary of Algorithms
16.7 Conclusion
References
Chapter 17: Information-Theoretic Gene Selection in Expression Data
17.1 Introduction
17.2 Curse of Dimensionality
17.3 Variable Selection Exploration Strategies
17.4 Relevance, Redundancy, and Synergy
17.5 Information-Theoretic Filters
17.6 Fast Mutual Information Estimation
17.7 Conclusions
References
Chapter 18: Feature Selection and Classification for Gene Expression Data Using Evolutionary Computation
18.1 Introduction
18.2 Preliminaries
18.3 Evolutionary Reduct Generation
18.4 Experimental Results
18.5 Conclusion
References
Section II: Biological Data Mining
Part E: Regression Analysis of Biological Data
Chapter 19: Building Valid Regression Models for Biological Data Using Stata and R
19.1 Introduction
19.2 Fitting the Model
19.3 Validity of the Model
19.4 Nonconstant Variance and Variable Transformation
19.5 Marginal Model Plots
19.6 Patterns in Residual Plots
19.7 Variable Selection
References
Chapter 20: Logistic Regression in Genomewide Association Analysis
20.1 Introduction
20.2 Single Genetic Marker: Basic Concepts
20.3 Single Genetic Marker: Statistical Tests
20.4 Two Genetic Markers and Fisher’s Nonadditivity Interaction
20.5 Many Genetic Markers in Genomewide Association Analysis: Variable Reduction and Penalized Regression
20.6 Latent Variables and Dimension Reduction: Partial Least-Squares Regression
20.7 Latent Variables: Logic Regression
20.8 Discussion
Appendix: Matrix Representation of Partial Least-Squares Regression
Acknowledgments
References
Chapter 21: Semiparametric Regression Methods in Longitudinal Data: Applications to AIDS Clinical Trial Data
21.1 Introduction
21.2 Modeling a Single Treatment Group Using a Semiparametric Partially Linear Model
21.3 Modeling Within-Subject Covariance
21.4 Modeling Multiple Treatment Groups
21.5 Summary
Acknowledgment
References
Part F: Biological Data Clustering
Chapter 22: The Three Steps of Clustering in the Post-Genomic Era
22.1 Introduction
22.2 Experimental Set-Up
22.3 Distances
22.4 Clustering Algorithms
22.5 Internal Validation Measures
22.6 Conclusions
Acknowledgment
References
Chapter 23: Clustering Algorithms of Microarray Data
23.1 Introduction
23.2 Geometric Clustering Algorithms
23.3 Model-Based Clustering Algorithms
23.4 Formal Concept–Based Clustering Algorithms
23.5 Clustering Webtools
23.6 Microarray Data Sets
23.7 Conclusion
References
Chapter 24: Spread of Evaluation Measures for Microarray Clustering
24.1 Introduction
24.2 Search Procedure and Classification of Evaluation Measures
24.3 Internal Measures
24.4 External Measures
24.5 Biological Measures
24.6 Discussion
24.7 Data Sets
24.8 Conclusions
References
Chapter 25: Survey on Biclustering of Gene Expression Data
25.1 Introduction
25.2 Types of Biclusters
25.3 Groups of Biclusters
25.4 Evaluation Functions
25.5 Systematic and Stochastic Biclustering Algorithms
25.6 Bicluster Validation
25.7 Conclusion
Acknowledgments
References
Chapter 26: Multiobjective Biclustering of Gene Expression Data with Bioinspired Algorithms
26.1 Introduction
26.2 Biclustering Problem in Microarray Data
26.3 Multiobjective Model for Biclustering in Gene Expression Data
26.4 Bioinspired Algorithms for Biclustering
26.5 Results and Discussions
26.6 Conclusion
References
Chapter 27: Coclustering Under Gene Ontology Derived Constraints for Pathway Identification
27.1 Introduction
27.2 Related Work
27.3 Constrained Coclustering
27.4 Parameterless Methodology for GO-driven Coclustering
27.5 Case Study
27.6 Conclusion
References
Part G: Biological Data Classification
Chapter 28: Survey on Fingerprint Classification Methods for Biological Sequences
28.1 Introduction
28.2 Basic Definitions and Problem Statements
28.3 Overview of Various Classification Approaches
28.4 Missing-Value Estimation Methods
28.5 Fingerprint Classification: Combinatorial Approach for Estimating Missing Values
Acknowledgments
References
Chapter 29: Microarray Data Analysis: From Preparation to Classification
29.1 Introduction
29.2 Experiment Design
29.3 Normalization
29.4 Ranking
29.5 Brief Review of Approaches of Microarray Data Classification
29.6 MIDClass: A Novel Approach to Effective Microarray Data Classification
29.7 Experimental Study
29.8 Conclusion
References
Chapter 30: Diversified Classifier Fusion Technique for Gene Expression Data
30.1 Introduction
30.2 Background Study
30.3 Preliminaries
30.4 Proposed Model
30.5 Experimental Evaluation
30.6 Conclusion
References
Chapter 31: RNA Classification and Structure Prediction: Algorithms and Case Studies
31.1 Introduction
31.2 Classification of RNA Sequences
31.3 In Silico Prediction of RNA Pseudoknots
31.4 Conclusion
References
Chapter 32: Ab Initio Protein Structure Prediction: Methods and Challenges
32.1 Introduction
32.2 Protein-Folding Problem Milestones at a Glance
32.3 Ab Initio Protein Structure Prediction
32.4 Pure Ab Initio Prediction
32.5 Ab Initio Prediction with Database Information
32.6 Discussion and Challenges
32.7 Appendix: CASP9
References
Chapter 33: Overview of Classification Methods to Support HIV/AIDS Clinical Decision Making
33.1 Predicting Resistance to Drugs
33.2 Predicting Coreceptor Usage
33.3 Identifying Subtype
33.4 Identifying Mutation Selection Pressure
33.5 Making Treatment-related Decisions
33.6 Future Directions
33.7 Conclusion
References
Part H: Association Rules Learning from Biological Data
Chapter 34: Mining Frequent Patterns and Association Rules from Biological Data
34.1 Introduction
34.2 Definition of AR Mining Problem
34.3 Algorithms for Mining ARs
34.4 Preprocessing and Postprocessing
34.5 Gene Expression Data Mining
34.6 Sequential Data Mining
34.7 Structural Data Mining
34.8 Protein Interactions: Graph Data Mining
34.9 Text Mining
34.10 Conclusion
References
Chapter 35: Galois Closure Based Association Rule Mining from Biological Data
35.1 Introduction
35.2 Association Rule Mining Frameworks
35.3 Condensed Representations of Association Rules
35.4 Interestingness Measures
35.5 Biological Applications
35.6 Conclusion
References
Chapter 36: Inference of Gene Regulatory Networks Based on Association Rules
36.1 Introduction
36.2 Data Mining and Inference of GRNs based on ARs
36.3 Techniques of Inference of GRNs based on AR
36.4 Concluding Remarks
Acknowledgments
References
Part I: Text Mining and Application to Biological Data
Chapter 37: Current Methodologies for Biomedical Named Entity Recognition
37.1 Introduction
37.2 Preliminaries
37.3 Dictionary-Based Approaches
37.4 ML-Based Approaches
37.5 Hybrid Approaches
37.6 Use Cases
37.7 Conclusion
References
Chapter 38: Automated Annotation of Scientific Documents: Increasing Access to Biological Knowledge
38.1 Introduction
38.2 Survey of Tools
38.3 Technologies and Techniques
38.4 Discussion
38.5 Future Perspectives
Glossary
Acknowledgments
References
Chapter 39: Augmenting Biological Text Mining with Symbolic Inference
39.1 Introduction
39.2 Identifying Implied Information
39.3 Predicting New Hypotheses
39.4 Text Mining with Distributional Analysis
39.5 Discussion and Conclusion
Acknowledgments
References
Chapter 40: Web Content Mining for Learning Generic Relations and their Associations from Textual Biological Data
40.1 Introduction
40.2 State-of-the-Art in Biological Relation Mining
40.3 Proposed Biological Relation-Mining System
40.4 Performance Evaluation
40.5 Uniqueness of Proposed Biological Relation-Mining System
40.6 Conclusion and Future Work
References
Chapter 41: Protein–Protein Relation Extraction from Biomedical Abstracts
41.1 Introduction
41.2 BioEve: BioMolecular Event Extractor
41.3 Sentence-Level Classification and Semantic Labeling
41.4 Event Extraction Using Dependency Parsing
41.5 Experiments and Evaluations
41.6 Conclusions
Acknowledgments
References
Part J: High-Performance Computing for Biological Data Mining
Chapter 42: Accelerating Pairwise Alignment Algorithms by Using Graphics Processor Units
42.1 Introduction
42.2 Pairwise Alignment Algorithms
42.3 Graphics Processor Units
42.4 Accelerating Pairwise Alignment Algorithms
42.5 Conclusion
References
Chapter 43: High-Performance Computing in High-Throughput Sequencing
43.1 Introduction
43.2 Next-Generation Sequencing Applications
43.3 High-Performance Computing Architectures: Short Summary
43.4 High-Performance Computing on Next-Generation Sequencing Data
43.5 Summary
References
Chapter 44: Large-Scale Clustering of Short Reads for Metagenomics on GPUs
44.1 Introduction
44.2 Background
44.3 Pairwise Global Alignment
44.4 GPU Programming
44.5 CRiSPy-CUDA
44.6 Experiments
44.7 Conclusions
References
Section III: Biological Data Postprocessing
Part K: Biological Knowledge Integration and Visualization
Chapter 45: Integration of Metabolic Knowledge for Genome-Scale Metabolic Reconstruction
45.1 Introduction
45.2 Omics Era
45.3 Metabolic Network Modeling
45.4 History of Genome-Scale Models
45.5 How Genome-Scale Metabolic Models Can Be Generated
45.6 Applications
45.7 Biochemical Pathways and Genome Annotation Databases
45.8 Conclusion
References
Chapter 46: Inferring and Postprocessing Huge Phylogenies
46.1 Introduction
46.2 Recent Advances
46.3 Data Avalanche: Example with rbcL
46.4 Future Challenges and Opportunities
46.5 Conclusion
Acknowledgment
References
Chapter 47: Biological Knowledge Visualization
47.1 Introduction
47.2 Information Visualization and Visual Analytics
47.3 Biological Data Types
47.4 Biological Data Visualization Issues
47.5 Sequence Data Visualization
47.6 Relational and Functional Data Visualization
47.7 Expression Data Visualization
47.8 Structure Data Visualization
47.9 Conclusion and Future Perspectives
References
Chapter 48: Visualization of Biological Knowledge Based on Multimodal Biological Data
48.1 Introduction
48.2 Multimodal Biological Data
48.3 Approaches to Discover Knowledge from Multimodal Biological Data
48.4 Novel Approach for Visualization and Discovery of Biological Knowledge Based on Multimodal Biological Data
48.5 Conclusion


