Biological Data Modelling and Scripting in R
Systems and Computational Biology - Bioinformatics and Computational Modeling, p. 262

... more widely. R is maintained by a core group of experts, which helps ensure its long-term availability. The R repository also holds a number of packages useful in various fields of biology. These packages help solve biological problems in a well-structured manner, saving time and money.

3. Data modeling for R

Data modeling for R begins with identifying the datasets required for the problem at hand. The data in each dataset need to be structured into relevant rows and columns. Each field (column) may hold only one data type, either character or numeric, and every cell in that column must conform to the same format. The data then need to be standardized, or pre-processed, by checking for inconsistencies, e.g., replacing blank cells with "Not known" or "None", checking header names for unwanted symbols such as ?@$%*^#/, and checking that each column holds a single data type. The datasets may then be made into R objects. Data modeling thus plays an important role in ensuring that data can be read and operated on easily and correctly by scripts on the R platform.

4. S4 object oriented programming

S4 is the 4th version of S. The major development of S4 over S3 is the integration of functions, which allows S to be considered an object oriented language. The object system in S4 provides a rich way of defining classes, handling inheritance, setting generic methods, validity checking and multiple dispatch. This allows the development of easy-to-operate packages for rapid data handling within an organized, structured framework.

4.1 Setting class and reading data into S4 objects

Classes with specific representations are created in S4. New objects belonging to a defined class may then be created, and generic functions may be defined that operate on objects of the class:

1. setClass() is used to define a class for the data
2. new() is used to create objects of the defined class
3. setGeneric() helps define generics
4. setMethod() is used to set methods

5. Decision tree

A decision tree (Maimon et al., 2005) is a tree-like graph that a decision maker can create to help select the best among several alternative courses of action. Biological problems can be solved with the help of well-structured and optimized algorithms. These algorithms can be represented as decision trees to give a clearer understanding of the process the algorithm follows to solve the biological problem.
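
The pre-processing checks of section 3 and the four S4 functions of section 4.1 can be illustrated together with a minimal sketch. It is not taken from the chapter: the data frame `raw`, its column names and values, the class name `GeneTable` and the generic `meanLength` are all hypothetical and chosen only for illustration.

```r
library(methods)  # S4 support; attached by default in most R sessions

# Hypothetical example data (assumption): headers contain unwanted symbols
# and one character column has a blank and a missing cell.
raw <- data.frame(
  "gene?id"   = c("g1", "g2", "g3"),
  "organism@" = c("E. coli", "", NA),
  "length$"   = c(1200, 950, 1430),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Pre-processing: keep only letters, digits and underscores in header names.
names(raw) <- gsub("[^A-Za-z0-9_]", "", names(raw))

# Pre-processing: replace blank or missing cells in character columns.
for (col in names(raw)) {
  if (is.character(raw[[col]])) {
    blank <- is.na(raw[[col]]) | raw[[col]] == ""
    raw[[col]][blank] <- "Not known"
  }
}

# setClass(): define a class with one slot and a validity check that
# every column is either character or numeric.
setClass(
  "GeneTable",
  representation(records = "data.frame"),
  validity = function(object) {
    ok <- vapply(object@records,
                 function(x) is.character(x) || is.numeric(x),
                 logical(1))
    if (all(ok)) TRUE else "every column must be character or numeric"
  }
)

# new(): create an object of the class; the validity check runs here.
genes <- new("GeneTable", records = raw)

# setGeneric(): declare a generic function.
setGeneric("meanLength", function(object) standardGeneric("meanLength"))

# setMethod(): implement the generic for the GeneTable class.
setMethod("meanLength", "GeneTable", function(object) {
  mean(object@records$length)
})

meanLength(genes)
```

Placing the column-type check in the class's validity function means a dataset that breaks the one-type-per-column rule is rejected when the object is created, rather than failing later inside a method.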