Biological Knowledge Discovery Handbook: Preprocessing, Mining, and Postprocessing of Biological Data, 1st Edition, by Mourad Elloumi, Albert Zomaya, Yi Pan – Ebook PDF Instant Download/Delivery: 1118132734, 978-1118132739

Product details:

ISBN 10: 1118132734
ISBN 13: 978-1118132739
Authors: Mourad Elloumi, Albert Zomaya, Yi Pan

The first comprehensive overview of preprocessing, mining, and postprocessing of biological data

Molecular biology is undergoing exponential growth in both the volume and complexity of biological data, and knowledge discovery offers the capacity to automate complex search and data analysis tasks. This book presents a broad overview of the most recent developments in techniques and approaches in the field of biological knowledge discovery and data mining (KDD), providing in-depth fundamental and technical field information on the most important topics encountered.

Written by top experts, Biological Knowledge Discovery Handbook: Preprocessing, Mining, and Postprocessing of Biological Data covers the three main phases of knowledge discovery (data preprocessing; data processing, also known as data mining; and data postprocessing) and analyzes both verification systems and discovery systems.

Table of contents:

Section I: Biological Data Preprocessing

Part A: Biological Data Management

Chapter 1: Genome and Transcriptome Sequence Databases for Discovery, Storage, and Representation of Alternative Splicing Events

1.1 Introduction

1.2 Splicing

1.3 Alternative Splicing

1.4 Alternative Splicing Databases

1.5 Data Mining from Alternative Splicing Databases

Acknowledgments

Web Resources

References

Chapter 2: Cleaning, Integrating, and Warehousing Genomic Data from Biomedical Resources

2.1 Introduction

2.2 Related Work

2.3 Typology of Data Quality Problems in Biomedical Resources

2.4 Cleaning, Integrating, and Warehousing Biomedical Data

2.5 Conclusions and Perspectives

Web Resources

References

Chapter 3: Cleansing of Mass Spectrometry Data for Protein Identification and Quantification

3.1 Introduction

3.2 Preprocessing Approach for Improving Protein Identification

3.3 Identification Filtering Approach for Improving Protein Identification

3.4 Evaluation Results

3.5 Conclusion

References

Chapter 4: Filtering Protein–Protein Interactions by Integration of Ontology Data

4.1 Introduction

4.2 Evaluation of Semantic Similarity

4.3 Identification of False Protein–Protein Interaction Data

4.4 Conclusion

References

Part B: Biological Data Modeling

Chapter 5: Complexity and Symmetries in DNA Sequences

5.1 Introduction

5.2 Archaea

5.3 Patterns on Indicator Matrix

5.4 Measure of Complexity and Information

5.5 Complex Root Representation of DNA Words

5.6 DNA Walks

5.7 Wavelet Analysis

5.8 Algorithm of Short Haar Discrete Wavelet Transform

5.9 Conclusions

References

Chapter 6: Ontology-Driven Formal Conceptual Data Modeling for Biological Data Analysis

6.1 Introduction

6.2 Description Logics for Conceptual Data Modeling

6.3 Extensions

6.4 Automated Reasoning and Biological Knowledge Discovery

6.5 Conclusions and Outlook

References

Chapter 7: Biological Data Integration Using Network Models

7.1 Introduction

7.2 Biological Network Models

7.3 Network Models in Understanding Disease

7.4 Future Challenges

Acknowledgment

References

Chapter 8: Network Modeling of Statistical Epistasis

8.1 Introduction

8.2 Epistasis and Detection

8.3 Network

8.4 Gene-Association Interaction Network

8.5 Statistical Epistasis Networks

8.6 Concluding Remarks

Acknowledgment

References

Chapter 9: Graphical Models for Protein Function and Structure Prediction

9.1 Introduction

9.2 Graphical Models

9.3 Applications

9.4 Summary

Acknowledgments

References

Part C: Biological Feature Extraction

Chapter 10: Algorithms and Data Structures for Next-Generation Sequences

10.1 Aligners

10.2 Assemblers

References

Chapter 11: Algorithms for Next-Generation Sequencing Data

11.1 Introduction

11.2 Definitions and Notations

11.3 REAL: A Read Aligner for Mapping Short Reads to a Genome

11.4 CREAL: Mapping Short Reads to a Genome with Circular Structure

11.5 DynMap: Mapping Short Reads to Multiple Closely Related Genomes

11.6 Conclusion

References

Chapter 12: Gene Regulatory Network Identification with Qualitative Probabilistic Networks

12.1 Central Dogma: Gene Expression in a Cell

12.2 Measuring Expression Levels: Microarray Technology

12.3 Understanding Gene Regulatory Networks: Basic Concepts

12.4 Bayesian Networks for Learning GRNs

12.5 Toward Qualitative Modeling of GRNs

12.6 QPNs for Gene Regulation

12.7 Summary and Conclusions

References

Part D: Biological Feature Selection

Chapter 13: Comparing, Ranking, and Filtering Motifs with Character Classes: Application to Biological Sequences Analysis

13.1 Introduction

13.2 Motifs with Character Classes: A Characterization

13.3 Filtering by Means of Underlying Motifs

13.4 Experimental Results and Discussion

13.5 Conclusion

Acknowledgments

References

Chapter 14: Stability of Feature Selection Algorithms and Ensemble Feature Selection Methods in Bioinformatics

14.1 Introduction

14.2 Feature Selection Algorithms and Instability

14.3 Ensemble Feature Selection Algorithms

14.4 Metrics for Stability Assessment

14.5 Conclusions

Acknowledgment

References

Chapter 15: Statistical Significance Assessment for Biological Feature Selection: Methods and Issues

15.1 Introduction

15.2 Statistical Significance Assessment

15.3 p-Value Distribution and π0 Estimation

15.4 Obtaining Control and Background Estimation

15.5 Statistical Significance in Integrative Analysis

15.6 Conclusions

Symbols

Acknowledgments

References

Chapter 16: Survey of Novel Feature Selection Methods for Cancer Classification

16.1 Biological Background

16.2 Introduction

16.3 Kernel-Based Feature Selection with Hilbert–Schmidt Independence Criterion

16.4 Redundancy-Based Gene Selection

16.5 Unsupervised Feature Selection

16.6 Summary of Algorithms

16.7 Conclusion

References

Chapter 17: Information-Theoretic Gene Selection in Expression Data

17.1 Introduction

17.2 Curse of Dimensionality

17.3 Variable Selection Exploration Strategies

17.4 Relevance, Redundancy, and Synergy

17.5 Information-Theoretic Filters

17.6 Fast Mutual Information Estimation

17.7 Conclusions

References

Chapter 18: Feature Selection and Classification for Gene Expression Data Using Evolutionary Computation

18.1 Introduction

18.2 Preliminaries

18.3 Evolutionary Reduct Generation

18.4 Experimental Results

18.5 Conclusion

References

Section II: Biological Data Mining

Part E: Regression Analysis of Biological Data

Chapter 19: Building Valid Regression Models for Biological Data Using Stata and R

19.1 Introduction

19.2 Fitting the Model

19.3 Validity of the Model

19.4 Nonconstant Variance and Variable Transformation

19.5 Marginal Model Plots

19.6 Patterns in Residual Plots

19.7 Variable Selection

References

Chapter 20: Logistic Regression in Genomewide Association Analysis

20.1 Introduction

20.2 Single Genetic Marker: Basic Concepts

20.3 Single Genetic Marker: Statistical Tests

20.4 Two Genetic Markers and Fisher’s Nonadditivity Interaction

20.5 Many Genetic Markers in Genomewide Association Analysis: Variable Reduction and Penalized Regression

20.6 Latent Variables and Dimension Reduction: Partial Least-Squares Regression

20.7 Latent Variables: Logic Regression

20.8 Discussion

Appendix: Matrix Representation of Partial Least-Squares Regression

Acknowledgments

References

Chapter 21: Semiparametric Regression Methods in Longitudinal Data: Applications to AIDS Clinical Trial Data

21.1 Introduction

21.2 Modeling a Single Treatment Group Using a Semiparametric Partially Linear Model

21.3 Modeling Within-Subject Covariance

21.4 Modeling Multiple Treatment Groups

21.5 Summary

Acknowledgment

References

Part F: Biological Data Clustering

Chapter 22: The Three Steps of Clustering in the Post-Genomic Era

22.1 Introduction

22.2 Experimental Set-Up

22.3 Distances

22.4 Clustering Algorithms

22.5 Internal Validation Measures

22.6 Conclusions

Acknowledgment

References

Chapter 23: Clustering Algorithms of Microarray Data

23.1 Introduction

23.2 Geometric Clustering Algorithms

23.3 Model-Based Clustering Algorithms

23.4 Formal Concept–Based Clustering Algorithms

23.5 Clustering Webtools

23.6 Microarray Data Sets

23.7 Conclusion

References

Chapter 24: Spread of Evaluation Measures for Microarray Clustering

24.1 Introduction

24.2 Search Procedure and Classification of Evaluation Measures

24.3 Internal Measures

24.4 External Measures

24.5 Biological Measures

24.6 Discussion

24.7 Data Sets

24.8 Conclusions

References

Chapter 25: Survey on Biclustering of Gene Expression Data

25.1 Introduction

25.2 Types of Biclusters

25.3 Groups of Biclusters

25.4 Evaluation Functions

25.5 Systematic and Stochastic Biclustering Algorithms

25.6 Bicluster Validation

25.7 Conclusion

Acknowledgments

References

Chapter 26: Multiobjective Biclustering of Gene Expression Data with Bioinspired Algorithms

26.1 Introduction

26.2 Biclustering Problem in Microarray Data

26.3 Multiobjective Model for Biclustering in Gene Expression Data

26.4 Bioinspired Algorithms for Biclustering

26.5 Results and Discussions

26.6 Conclusion

References

Chapter 27: Coclustering Under Gene Ontology Derived Constraints for Pathway Identification

27.1 Introduction

27.2 Related Work

27.3 Constrained Coclustering

27.4 Parameterless Methodology for GO-driven Coclustering

27.5 Case Study

27.6 Conclusion

References

Part G: Biological Data Classification

Chapter 28: Survey on Fingerprint Classification Methods for Biological Sequences

28.1 Introduction

28.2 Basic Definitions and Problem Statements

28.3 Overview of Various Classification Approaches

28.4 Missing-Value Estimation Methods

28.5 Fingerprint Classification: Combinatorial Approach for Estimating Missing Values

Acknowledgments

References

Chapter 29: Microarray Data Analysis: From Preparation to Classification

29.1 Introduction

29.2 Experiment Design

29.3 Normalization

29.4 Ranking

29.5 Brief Review of Approaches of Microarray Data Classification

29.6 MIDClass: A Novel Approach to Effective Microarray Data Classification

29.7 Experimental Study

29.8 Conclusion

References

Chapter 30: Diversified Classifier Fusion Technique for Gene Expression Data

30.1 Introduction

30.2 Background Study

30.3 Preliminaries

30.4 Proposed Model

30.5 Experimental Evaluation

30.6 Conclusion

References

Chapter 31: RNA Classification and Structure Prediction: Algorithms and Case Studies

31.1 Introduction

31.2 Classification of RNA Sequences

31.3 In Silico Prediction of RNA Pseudoknots

31.4 Conclusion

References

Chapter 32: Ab Initio Protein Structure Prediction: Methods and Challenges

32.1 Introduction

32.2 Protein-Folding Problem Milestones at a Glance

32.3 Ab Initio Protein Structure Prediction

32.4 Pure Ab Initio Prediction

32.5 Ab Initio Prediction with Database Information

32.6 Discussion and Challenges

32.7 Appendix: CASP9

References

Chapter 33: Overview of Classification Methods to Support HIV/AIDS Clinical Decision Making

33.1 Predicting Resistance to Drugs

33.2 Predicting Coreceptor Usage

33.3 Identifying Subtype

33.4 Identifying Mutation Selection Pressure

33.5 Making Treatment-related Decisions

33.6 Future Directions

33.7 Conclusion

References

Part H: Association Rules Learning from Biological Data

Chapter 34: Mining Frequent Patterns and Association Rules from Biological Data

34.1 Introduction

34.2 Definition of AR Mining Problem

34.3 Algorithms for Mining ARs

34.4 Preprocessing and Postprocessing

34.5 Gene Expression Data Mining

34.6 Sequential Data Mining

34.7 Structural Data Mining

34.8 Protein Interactions: Graph Data Mining

34.9 Text Mining

34.10 Conclusion

References

Chapter 35: Galois Closure Based Association Rule Mining from Biological Data

35.1 Introduction

35.2 Association Rule Mining Frameworks

35.3 Condensed Representations of Association Rules

35.4 Interestingness Measures

35.5 Biological Applications

35.6 Conclusion

References

Chapter 36: Inference of Gene Regulatory Networks Based on Association Rules

36.1 Introduction

36.2 Data Mining and Inference of GRNs based on ARs

36.3 Techniques of Inference of GRNs based on AR

36.4 Concluding Remarks

Acknowledgments

References

Part I: Text Mining and Application to Biological Data

Chapter 37: Current Methodologies for Biomedical Named Entity Recognition

37.1 Introduction

37.2 Preliminaries

37.3 Dictionary-Based Approaches

37.4 ML-Based Approaches

37.5 Hybrid Approaches

37.6 Use Cases

37.7 Conclusion

References

Chapter 38: Automated Annotation of Scientific Documents: Increasing Access to Biological Knowledge

38.1 Introduction

38.2 Survey of Tools

38.3 Technologies and Techniques

38.4 Discussion

38.5 Future Perspectives

Glossary

Acknowledgments

References

Chapter 39: Augmenting Biological Text Mining with Symbolic Inference

39.1 Introduction

39.2 Identifying Implied Information

39.3 Predicting New Hypotheses

39.4 Text Mining with Distributional Analysis

39.5 Discussion and Conclusion

Acknowledgments

References

Chapter 40: Web Content Mining for Learning Generic Relations and their Associations from Textual Biological Data

40.1 Introduction

40.2 State-of-the-Art in Biological Relation Mining

40.3 Proposed Biological Relation-Mining System

40.4 Performance Evaluation

40.5 Uniqueness of Proposed Biological Relation-Mining System

40.6 Conclusion and Future Work

References

Chapter 41: Protein–Protein Relation Extraction from Biomedical Abstracts

41.1 Introduction

41.2 BioEve: BioMolecular Event Extractor

41.3 Sentence-Level Classification and Semantic Labeling

41.4 Event Extraction Using Dependency Parsing

41.5 Experiments and Evaluations

41.6 Conclusions

Acknowledgments

References

Part J: High-Performance Computing for Biological Data Mining

Chapter 42: Accelerating Pairwise Alignment Algorithms by Using Graphics Processor Units

42.1 Introduction

42.2 Pairwise Alignment Algorithms

42.3 Graphics Processor Units

42.4 Accelerating Pairwise Alignment Algorithms

42.5 Conclusion

References

Chapter 43: High-Performance Computing in High-Throughput Sequencing

43.1 Introduction

43.2 Next-Generation Sequencing Applications

43.3 High-Performance Computing Architectures: Short Summary

43.4 High-Performance Computing on Next-Generation Sequencing Data

43.5 Summary

References

Chapter 44: Large-Scale Clustering of Short Reads for Metagenomics on GPUs

44.1 Introduction

44.2 Background

44.3 Pairwise Global Alignment

44.4 GPU Programming

44.5 CRiSPy-CUDA

44.6 Experiments

44.7 Conclusions

References

Section III: Biological Data Postprocessing

Part K: Biological Knowledge Integration and Visualization

Chapter 45: Integration of Metabolic Knowledge for Genome-Scale Metabolic Reconstruction

45.1 Introduction

45.2 Omics Era

45.3 Metabolic Network Modeling

45.4 History of Genome-Scale Models

45.5 How Genome-Scale Metabolic Models Can Be Generated

45.6 Applications

45.7 Biochemical Pathways and Genome Annotation Databases

45.8 Conclusion

References

Chapter 46: Inferring and Postprocessing Huge Phylogenies

46.1 Introduction

46.2 Recent Advances

46.3 Data Avalanche: Example with rbcL

46.4 Future Challenges and Opportunities

46.5 Conclusion

Acknowledgment

References

Chapter 47: Biological Knowledge Visualization

47.1 Introduction

47.2 Information Visualization and Visual Analytics

47.3 Biological Data Types

47.4 Biological Data Visualization Issues

47.5 Sequence Data Visualization

47.6 Relational and Functional Data Visualization

47.7 Expression Data Visualization

47.8 Structure Data Visualization

47.9 Conclusion and Future Perspectives

References

Chapter 48: Visualization of Biological Knowledge Based on Multimodal Biological Data

48.1 Introduction

48.2 Multimodal Biological Data

48.3 Approaches to Discover Knowledge from Multimodal Biological Data

48.4 Novel Approach for Visualization and Discovery of Biological Knowledge Based on Multimodal Biological Data

48.5 Conclusion

