Day 11 Assignment

30 Jul

prolog1 prolog2 prolog2out


Day 11 Journal

30 Jul

I think there is a middle ground between free and closed software: programmers can choose which parts of their software to share and which to keep private by wrapping code in functions and classes with private access. When programmers build their own Frankenstein monsters out of code from multiple sources, they can import a “locked” chunk of code that can be used but not inspected. That way, the original creator can still regulate how the software is used, as in a closed software approach.

 

imgres


Day 10 Assignment – EPGY Author screenshot

29 Jul

Capture4

Day 10 Journal

29 Jul

Both thermostats and thermometers are changed by their environment with little user input, but only the thermostat can also change its environment, which shows the difference between autonomy and agency. Autonomy is the concept where a computer can function with no user input at all, and agency is the concept where the computer has influence over its own environment. In my opinion, these two concepts are completely independent of each other, since something can have agency without being autonomous, and something can be autonomous without agency (the thermometer).

mzl.atkciufz.320x480-75

 

 

Final Project – Bioinformatics and Biocomputation

26 Jul

Abstract

This paper gives a basic overview of the modern field of Bioinformatics, which applies Statistics and Computer Science to newly acquired genomic sequences in Biology. More generally, Bioinformatics is the field that manipulates and applies biological data through analysis based on informational sciences such as statistics. One of its fastest growing and most notable subfields is the analysis of the recently sequenced Human Genome, which holds the basic genetic information that creates and defines human beings.

Introduction

Since 2003, the Human Genome has been entirely sequenced into its basic components, meaning that the 23 haploid chromosomes, which are found with some variation in every human and contain all of a person's genetic information, have been mapped into individual nucleotide sequences. Everything that makes a person human is found within combinations of nucleic acids with four major nitrogenous bases: Adenine, Thymine, Cytosine, and Guanine. These are expressed through protein synthesis, which reads the bases in groups of 3 called codons. Basically, in eukaryotic protein synthesis an isolated sequence of DNA is transcribed into an mRNA polymer by matching base pairs. The mRNA is modified and then translated into a polypeptide, as amino acids are joined by dehydration synthesis in the order given by the mRNA codons, and the polypeptide is modified again and folded into a functioning protein. However, all of this genetic information means nothing if the data cannot be interpreted. Bioinformatics is the application of informational sciences to biological processes. With over 3 billion base pairs in the human genome alone, computer science and statistics are necessary to recognize patterns of coding and non-coding segments. Since there is variation in every human genome, it can be very difficult to distinguish unfavorable genes, such as mutated or modified proto-oncogenes that cause cancer, from regular nucleotide variation between people.
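The transcription and translation steps described above can be sketched in a few lines of Python. This is only a toy illustration: the codon table is a small hand-picked subset of the 64 real codons, the template strand is a made-up example, and the function names are my own. Real protein synthesis also involves mRNA modification and folding, which are not modeled here.

```python
# Partial codon table for illustration only; the real genetic code has 64 codons.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "AAA": "Lys",
    "UAA": "STOP",
    "UAG": "STOP",
    "UGA": "STOP",
}

# Base-pairing rules used when mRNA is built against a DNA template strand.
DNA_TO_MRNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_strand):
    """Build the mRNA that is complementary to a DNA template strand."""
    return "".join(DNA_TO_MRNA[base] for base in template_strand)


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        # "???" marks a codon that this toy table does not contain.
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "???")
        if amino_acid == "STOP":
            break
        peptide.append(amino_acid)
    return peptide


mrna = transcribe("TACAAACCGATT")  # hypothetical template strand
print(mrna)             # AUGUUUGGCUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly']
```

Each group of 3 bases maps to one amino acid, which is why even a short sequence can only be read correctly if the reading starts at the right position.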

cropped genome_timeline_web

Application

As mentioned before, comparing healthy and problematic genetic sequences can help with genetically linked diseases like cancer. By comparing and analyzing different genomes, Bioinformatics opens up new opportunities to study life and development by recognizing the functions of specific segments. However, most genetic diseases come from a combination of environmental and genetic factors. Manipulating and analyzing genomes also opens new ways of studying Biology based on the basic defining characteristics of organisms. One potential future application is DNA storage, which takes advantage of the small, dense, but distinguishable base sequences of DNA to store digital data. While it is currently costly, technology is expected to be developed that makes it easier and more efficient to use.
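To give a rough idea of how digital data could be written as DNA bases, here is a minimal Python sketch that stores two bits in each base. The bit-to-base mapping and the function names are assumptions made up for this example. Actual DNA storage schemes are more involved (for instance, they add error correction), so this is not how any real system works.

```python
# Hypothetical mapping for illustration: each base stores two bits.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

With this mapping every byte takes only four bases, which hints at why DNA is considered such a dense storage medium.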

Previous Work

The two earliest “founders” of Bioinformatics are Elvin Kabat, who determined the major structures of the antigens involved in blood type and started to analyze the biological sequences of antibodies, and Margaret Dayhoff, who developed programmable methods of comparing protein amino acid sequences. In addition, as of 2012, thousands of other organisms have had their genomes sequenced.

Problems in the Area

One of the most significant problems in Bioinformatics is the sheer size of the data that needs to be analyzed accurately, since most analyses involve very large numbers of sequences. Alignment is a problem in Bioinformatics where segments of the genome are compared with other segments for similarity. In Local Alignment, substrings of the sequences are compared to find regions of similarity within the larger sequences. Also, the majority of the genetic information found in eukaryotic DNA does not have a clear-cut function in coding proteins. Much of eukaryotic DNA is involved in regulating the expression of other segments of DNA. This means that, unlike computer code with its precise commands and functions, the same segment of DNA in a cell can have multiple functions on different occasions because of the splicing of introns and exons and other processes that modify the mRNA transcribed from the DNA, which makes interpreting sequences very difficult.

Untitled - Copy

Proposed Solutions

AI algorithms have been used to search through biological data, including models based on Markov chains, which can approximate sequences to some degree of accuracy while keeping an efficient running time on large amounts of data and cutting out noise and redundancies. Also, when sequencing large amounts of DNA, “shotgun sequencing” divides the DNA into smaller fragments that are sequenced and then stitched back together based on overlapping patterns at the ends of the fragments. There is also a method called “pathway analysis” that builds a network of protein products to find out how they interact, so that faulty proteins can be traced back to mutations in the DNA. For Alignment, one of the earlier techniques is the Smith-Waterman algorithm, which scores local base pairs by degree of similarity and applies dynamic programming, setting “checkpoints” whenever the score drops below zero.

Alignment

In this scoring method, there is a preset matrix of similarity values (on the left), which is treated as a weighted adjacency matrix to form the edges of the graph (on the right). The score at each node or connection is determined by the best possible path through any of its neighbors, which are formed by the base pairs. In this example, the connection T-A gives an edge of -1 based on the matrix, while the connection T-T gives an edge of 3, which is the better score.

The substrings in the Smith-Waterman algorithm are bounded by the previous point where the net alignment score fell below 0 and the next point where it does, which is where dynamic programming is applied. Since alignment is done on a local level, the local scores are then used to build up the larger alignment score.
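The scoring idea can be sketched as a small dynamic programming table in Python. This is a simplified version under a few assumptions of my own: every match scores 3 and every mismatch scores -1 (the two values from the example above) instead of a full similarity matrix, the gap penalty of -2 is made up, and only the best score is returned, not the aligned substrings. The `max(0, ...)` step plays the role of the “checkpoint”: whenever the running score would drop below zero, it restarts from 0.

```python
def smith_waterman(seq_a, seq_b, match=3, mismatch=-1, gap=-2):
    """Return the best local alignment score between two sequences."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    # score[i][j] holds the best score of a local alignment that ends
    # at seq_a[i - 1] and seq_b[j - 1]; row 0 and column 0 stay at 0.
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                0,                           # restart instead of going negative
                score[i - 1][j - 1] + pair,  # align the two bases
                score[i - 1][j] + gap,       # gap in seq_b
                score[i][j - 1] + gap,       # gap in seq_a
            )
            best = max(best, score[i][j])
    return best


print(smith_waterman("ACGT", "ACGT"))  # 12
print(smith_waterman("GGTTACGG", "CCTTACCC"))
```

In the second call the two sequences only share the middle stretch TTAC, so the best local score comes from that stretch alone and the mismatched ends are ignored.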

Conclusion

Overall, Bioinformatics is a new field that merges two other subjects to make efficient use of a new pool of data with many potential applications in health and life science. By applying more efficient algorithms from Artificial Intelligence and Computer Science, the process of sequencing can become faster and more practical, which helps both parent fields.

Future Work

References

http://www.bioplanet.com/what-is-bioinformatics/

http://en.wikipedia.org/wiki/Margaret_Oakley_Dayhoff

http://en.wikipedia.org/wiki/Elvin_A._Kabat

http://en.wikipedia.org/wiki/Bioinformatics

http://en.wikipedia.org/wiki/Shotgun_sequencing

http://www.seas.gwu.edu/~simhaweb/cs151/lectures/module12/align.html

http://med.stanford.edu/ism/2013/march/bil-gates.html

http://en.wikipedia.org/wiki/Sanger_sequencing

Book

book

http://www.amazon.com/Bioinformatics-For-Dummies-Jean-Michel-Claverie/dp/0470089857/ref=sr_1_1?ie=UTF8&qid=1374868225&sr=8-1&keywords=bioinformatics

Day 9 Journal – Conference

26 Jul

Conference Logo

This is the International Workshop for Data Mining in Bioinformatics that will meet on August 10th, 2013 in Chicago.

I am mostly interested in two talks at the conference: the presentation on Signal Detection in Genomic Sequences, for its explanation of pattern recognition in DNA sequences and their function, and the Systems Biology talk on Cellular Aging, for its emphasis on DNA degeneration and how it affects the cell.

Day 8 Journal

25 Jul

If someone asked me what AI is, I would say that it is anything that can change its output based on more input. I think AI is capable of basic animal behavior, such as operant and classical conditioning, and can recognize patterns. I don’t consider realistic appearance or human-like modeling to be AI, since it involves more polishing than intelligence itself.