Tag: customization
Blog
How to Make a Customizable Google Custom Search Engine Box
Google Custom Search Engine (CSE) is a great service when you need to add search functionality to a website without spending too much time on it. However, the default way of implementing it restricts us to a certain search box design that might not blend well with our website. In this post, I’ll show the workaround I found and actually implemented on this very blog.
To see the solution in action, feel free to search anything on the search box at the top of the right sidebar.
Tag: google custom search
Blog
How to Make a Customizable Google Custom Search Engine Box
Google Custom Search Engine (CSE) is a great service when you need to add search functionality to a website without spending too much time on it. However, the default way of implementing it restricts us to a certain search box design that might not blend well with our website. In this post, I’ll show the workaround I found and actually implemented on this very blog.
To see the solution in action, feel free to search anything on the search box at the top of the right sidebar.
Tag: google custom search engine
Blog
How to Make a Customizable Google Custom Search Engine Box
Google Custom Search Engine (CSE) is a great service when you need to add search functionality to a website without spending too much time on it. However, the default way of implementing it restricts us to a certain search box design that might not blend well with our website. In this post, I’ll show the workaround I found and actually implemented on this very blog.
To see the solution in action, feel free to search anything on the search box at the top of the right sidebar.
Tag: search box
Blog
How to Make a Customizable Google Custom Search Engine Box
Google Custom Search Engine (CSE) is a great service when you need to add search functionality to a website without spending too much time on it. However, the default way of implementing it restricts us to a certain search box design that might not blend well with our website. In this post, I’ll show the workaround I found and actually implemented on this very blog.
To see the solution in action, feel free to search anything on the search box at the top of the right sidebar.
Blog
Bootstrap 4 Search Box with Search Icon
Bootstrap 4 is a very handy library for quickly building user interfaces for web pages and web applications. A search box is a fundamental UI element on any page that serves content, and in this post I’ll describe some styles that make a nice text input for a search box.
To accomplish this, I’ll make use of Bootstrap 3’s default form-validation markup, which was removed in Bootstrap 4 because it no longer supports font icons.
Tag: bootstrap 4
Blog
Bootstrap 4 Search Box with Search Icon
Bootstrap 4 is a very handy library for quickly building user interfaces for web pages and web applications. A search box is a fundamental UI element on any page that serves content, and in this post I’ll describe some styles that make a nice text input for a search box.
To accomplish this, I’ll make use of Bootstrap 3’s default form-validation markup, which was removed in Bootstrap 4 because it no longer supports font icons.
Tag: css
Blog
Bootstrap 4 Search Box with Search Icon
Bootstrap 4 is a very handy library for quickly building user interfaces for web pages and web applications. A search box is a fundamental UI element on any page that serves content, and in this post I’ll describe some styles that make a nice text input for a search box.
To accomplish this, I’ll make use of Bootstrap 3’s default form-validation markup, which was removed in Bootstrap 4 because it no longer supports font icons.
Tag: search icon
Blog
Bootstrap 4 Search Box with Search Icon
Bootstrap 4 is a very handy library for quickly building user interfaces for web pages and web applications. A search box is a fundamental UI element on any page that serves content, and in this post I’ll describe some styles that make a nice text input for a search box.
To accomplish this, I’ll make use of Bootstrap 3’s default form-validation markup, which was removed in Bootstrap 4 because it no longer supports font icons.
Tag: install
Blog
How to Install Sambamba on Linux
Sambamba is a great utility for working with alignment file formats used in bioinformatics, such as BAM and CRAM. Follow the steps below on any 64-bit Linux machine to install it (this guide installs version 0.6.8; see the Sambamba releases page for the most up-to-date version):
Create a softwares directory (optional but recommended):
cd ~/
mkdir softwares
cd softwares/
Download the static executable:
wget https://github.com/biod/sambamba/releases/download/v0.6.8/sambamba-0.6.8-linux-static.gz
Unzip the package and rename the unpacked executable
Blog
Correct Installation and Configuration of pip2 and pip3
You may have to keep both Python versions, the old 2 and the newer 3, on the same system because of your projects, and each requires its corresponding pip installation so you can install and maintain packages for the two versions separately.
There are multiple ways of installing pip on a system, but configuring the versions and setting the default version for the pip executable can be tricky.
Below is the easiest solution I’ve found.
Blog
How to Install Valgrind on macOS High Sierra
Valgrind is a programming tool for memory debugging, memory leak detection and profiling. Its installation on macOS High Sierra seems problematic, and I wanted to write this post to share the solution that worked for me. I use Homebrew to install it, which is the recommended way, and the solution below relies on it as well.
So, when you try installing right away, you may get the following error:
brew install valgrind
valgrind: This formula either does not compile or function as expected on macOS
versions newer than Sierra due to an upstream incompatibility.
Blog
How to Install Numpy Python Package on Windows
Numpy (Numerical Python) is a great Python package that you should definitely make use of if you’re doing scientific computing.
Installing it on Windows can be difficult if you don’t know how to do it via the command line. There are unofficial Windows binaries of Numpy for 32-bit and 64-bit Windows which make it super easy to install.
Go to the link below and download the one for your system and Python version: http://www.
Blog
Install Ensembl API and BioPerl 1.2.3 on Your System
I’m going to work on a project that requires lots of queries on Ensembl databases, so I wanted to start by installing the Ensembl API. Since the API is written in Perl, I will be using Perl in this project.
There is a nice tutorial on the Ensembl website for the API installation. Here I will describe the steps.
1. Download the API and BioPerl
Go to the Ensembl FTP site ftp://ftp.ensembl.org/pub/ and download “ensembl-api.tar.gz”, or click here
Tag: installation
Blog
How to Install Sambamba on Linux
Sambamba is a great utility for working with alignment file formats used in bioinformatics, such as BAM and CRAM. Follow the steps below on any 64-bit Linux machine to install it (this guide installs version 0.6.8; see the Sambamba releases page for the most up-to-date version):
Create a softwares directory (optional but recommended):
cd ~/
mkdir softwares
cd softwares/
Download the static executable:
wget https://github.com/biod/sambamba/releases/download/v0.6.8/sambamba-0.6.8-linux-static.gz
Unzip the package and rename the unpacked executable
Tag: linux
Blog
How to Install Sambamba on Linux
Sambamba is a great utility for working with alignment file formats used in bioinformatics, such as BAM and CRAM. Follow the steps below on any 64-bit Linux machine to install it (this guide installs version 0.6.8; see the Sambamba releases page for the most up-to-date version):
Create a softwares directory (optional but recommended):
cd ~/
mkdir softwares
cd softwares/
Download the static executable:
wget https://github.com/biod/sambamba/releases/download/v0.6.8/sambamba-0.6.8-linux-static.gz
Unzip the package and rename the unpacked executable
Blog
Passwordless SSH for Mac/Linux
You don’t have to enter the ssh password every time you make a connection. Use the method below to generate a key, copy it to the host you want to connect to, and then connect anytime without entering your password.
Generate a key pair:
ssh-keygen
Copy the key to the remote host:
ssh-copy-id root@linuxconfig.org
Tag: sambamba
Blog
How to Install Sambamba on Linux
Sambamba is a great utility for working with alignment file formats used in bioinformatics, such as BAM and CRAM. Follow the steps below on any 64-bit Linux machine to install it (this guide installs version 0.6.8; see the Sambamba releases page for the most up-to-date version):
Create a softwares directory (optional but recommended):
cd ~/
mkdir softwares
cd softwares/
Download the static executable:
wget https://github.com/biod/sambamba/releases/download/v0.6.8/sambamba-0.6.8-linux-static.gz
Unzip the package and rename the unpacked executable
Tag: image compression
Blog
Easy and Free Method to Compress Images on macOS with GUI and Terminal
Image compression is mostly needed when you are short of storage on your devices, or when you serve images online and want to optimize them so they load fast, which greatly affects how search engines evaluate your content and how much users enjoy your website.
This is especially important if you are also aiming to support mobile devices and relatively slow internet connections.
Tag: image optimization
Blog
Easy and Free Method to Compress Images on macOS with GUI and Terminal
Image compression is mostly needed when you are short of storage on your devices, or when you serve images online and want to optimize them so they load fast, which greatly affects how search engines evaluate your content and how much users enjoy your website.
This is especially important if you are also aiming to support mobile devices and relatively slow internet connections.
Tag: macos
Blog
Easy and Free Method to Compress Images on macOS with GUI and Terminal
Image compression is mostly needed when you are short of storage on your devices, or when you serve images online and want to optimize them so they load fast, which greatly affects how search engines evaluate your content and how much users enjoy your website.
This is especially important if you are also aiming to support mobile devices and relatively slow internet connections.
Blog
Memory Leak Testing with Valgrind on macOS using Docker Containers
I had some issues installing Valgrind on macOS High Sierra and [posted some tips to successfully install it to the system]({% post_url 2018-04-28-how-to-install-valgrind-on-macos-high-sierra %}). Although I could install the software this way, it didn’t work correctly when tested with several real and dummy C++ programs: it reported a memory leak even for an empty program. So I decided to use an Ubuntu 16.04 based Docker container and test the code inside it with the Ubuntu version of Valgrind.
Blog
How to Install Valgrind on macOS High Sierra
Valgrind is a programming tool for memory debugging, memory leak detection and profiling. Its installation on macOS High Sierra seems problematic, and I wanted to write this post to share the solution that worked for me. I use Homebrew to install it, which is the recommended way, and the solution below relies on it as well.
So, when you try installing right away, you may get the following error:
brew install valgrind
valgrind: This formula either does not compile or function as expected on macOS
versions newer than Sierra due to an upstream incompatibility.
Tag: macos mojave
Blog
Easy and Free Method to Compress Images on macOS with GUI and Terminal
Image compression is mostly needed when you are short of storage on your devices, or when you serve images online and want to optimize them so they load fast, which greatly affects how search engines evaluate your content and how much users enjoy your website.
This is especially important if you are also aiming to support mobile devices and relatively slow internet connections.
Tag: database
Blog
MongoDB Listing Database Collections/Tables with Number of Records/Rows
Use the following script and command to quickly get the number of records/rows in each collection/table of a database.
mongo-ls.js script:
var collections = db.getCollectionNames();
for (var i = 0; i < collections.length; ++i) {
    print(collections[i] + ' - ' + db[collections[i]].count() + ' records');
}
Copy-paste this script into a text file and save it as mongo-ls.js.
Finally, use the following command to query the database. Make sure you replace HOSTNAME, DBNAME, USERNAME and PASSWORD with your own.
Blog
How to Generate Database EER Diagrams from SQL Scripts using MySQL Workbench
MySQL Workbench makes it really easy to generate EER diagrams from SQL scripts. Follow the steps below to make one yourself.
Download and install MySQL Workbench for your system.
See the simple SQL commands below; later I’ll use them to generate a sample diagram.
create table country (
  id integer primary key,
  name CHAR(55));

create table city (
  id integer primary key,
  country_id integer,
  name CHAR(55),
  foreign key (country_id) references country(id));
Open MySQL Workbench and create a new model (File -> New Model).
Blog
Get Size of MySQL Databases
Use the query below in the MySQL command prompt to get a table of databases and their sizes in MB.
SELECT table_schema "DB Name", Round(Sum(data_length + index_length) / 1024 / 1024, 1) "DB Size in MB" FROM information_schema.tables GROUP BY table_schema;
Blog
How to Set Up a MySQL Database for a Mezzanine Project
Install MySQL server and the python-mysqldb package:
sudo apt-get install mysql-server
sudo apt-get install python-mysqldb
Run MySQL:
mysql -u root -p
Create a database:
mysql> create database mezzanine_project;
Confirm it:
mysql> show databases;
Exit:
mysql> exit
Configure local_settings.py:
cd path/to/your/mezzanine/project
nano local_settings.py
Like the following:
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "mezzanine_project",
        "USER": "root",
        "PASSWORD": "123456",
        "HOST": "",
        "PORT": "",
    }
}
Note: replace the password with your own.
Blog
Retrieving Data with AJAX using jQuery, PHP and MySQL
Last semester, I took a course from the Informatics Institute at METU called “Biological Databases and Data Analysis Tools”, where we first learned what a database is and how to run queries on it, along with the technology behind databases. Then we covered many of the available biological databases and data analysis tools, including gene, protein and pathway databases, and tools for creating databases.
As a final project, we were asked to create an online tool that can search a database, retrieve the data, and display it in any web browser.
Tag: mongo
Blog
MongoDB Listing Database Collections/Tables with Number of Records/Rows
Use the following script and command to quickly get the number of records/rows in each collection/table of a database.
mongo-ls.js script:
var collections = db.getCollectionNames();
for (var i = 0; i < collections.length; ++i) {
    print(collections[i] + ' - ' + db[collections[i]].count() + ' records');
}
Copy-paste this script into a text file and save it as mongo-ls.js.
Finally, use the following command to query the database. Make sure you replace HOSTNAME, DBNAME, USERNAME and PASSWORD with your own.
Tag: mongodb
Blog
MongoDB Listing Database Collections/Tables with Number of Records/Rows
Use the following script and command to quickly get the number of records/rows in each collection/table of a database.
mongo-ls.js script:
var collections = db.getCollectionNames();
for (var i = 0; i < collections.length; ++i) {
    print(collections[i] + ' - ' + db[collections[i]].count() + ' records');
}
Copy-paste this script into a text file and save it as mongo-ls.js.
Finally, use the following command to query the database. Make sure you replace HOSTNAME, DBNAME, USERNAME and PASSWORD with your own.
Tag: nosql
Blog
MongoDB Listing Database Collections/Tables with Number of Records/Rows
Use the following script and command to quickly get the number of records/rows in each collection/table of a database.
mongo-ls.js script:
var collections = db.getCollectionNames();
for (var i = 0; i < collections.length; ++i) {
    print(collections[i] + ' - ' + db[collections[i]].count() + ' records');
}
Copy-paste this script into a text file and save it as mongo-ls.js.
Finally, use the following command to query the database. Make sure you replace HOSTNAME, DBNAME, USERNAME and PASSWORD with your own.
Tag: entrez id
Blog
Convert Gene Symbols to Entrez IDs in R
Bioinformatics studies usually use gene symbols as identifiers (IDs), as they are more recognizable than other IDs such as Entrez IDs. However, certain analyses (tools) may not accept gene symbols, because a gene usually has more than one symbol, which makes it harder to implement a method that works with symbols. In such cases, you may need to do a conversion, which is a very common task in bioinformatics.
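At its core, such a conversion is a (possibly many-to-one) lookup from symbols and their aliases to stable numeric IDs. Below is a minimal Python sketch with a tiny hand-made table; the dictionaries and the `to_entrez` function are illustrative only, and a real analysis would load the mapping from an annotation resource (e.g. NCBI gene_info, or org.Hs.eg.db in R):

```python
# Hypothetical, hand-made symbol -> Entrez mapping for illustration only.
# Real pipelines load this table from an annotation database.
SYMBOL_TO_ENTREZ = {
    "TP53": 7157,
    "EGFR": 1956,
    "MYC": 4609,
}

# Genes often have several symbols; map known aliases to the official one.
ALIASES = {"P53": "TP53", "ERBB1": "EGFR"}

def to_entrez(symbol):
    """Resolve an alias to its official symbol, then look up the Entrez ID.
    Returns None for unmapped symbols, so callers can handle them explicitly."""
    canonical = ALIASES.get(symbol, symbol)
    return SYMBOL_TO_ENTREZ.get(canonical)
```

Handling unmapped symbols explicitly (rather than dropping them silently) matters in practice, since no annotation resource covers every symbol in use.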
Tag: gene symbol
Blog
Convert Gene Symbols to Entrez IDs in R
Bioinformatics studies usually use gene symbols as identifiers (IDs), as they are more recognizable than other IDs such as Entrez IDs. However, certain analyses (tools) may not accept gene symbols, because a gene usually has more than one symbol, which makes it harder to implement a method that works with symbols. In such cases, you may need to do a conversion, which is a very common task in bioinformatics.
Tag: id conversion
Blog
Convert Gene Symbols to Entrez IDs in R
Bioinformatics studies usually use gene symbols as identifiers (IDs), as they are more recognizable than other IDs such as Entrez IDs. However, certain analyses (tools) may not accept gene symbols, because a gene usually has more than one symbol, which makes it harder to implement a method that works with symbols. In such cases, you may need to do a conversion, which is a very common task in bioinformatics.
Tag: r
Blog
Convert Gene Symbols to Entrez IDs in R
Bioinformatics studies usually use gene symbols as identifiers (IDs), as they are more recognizable than other IDs such as Entrez IDs. However, certain analyses (tools) may not accept gene symbols, because a gene usually has more than one symbol, which makes it harder to implement a method that works with symbols. In such cases, you may need to do a conversion, which is a very common task in bioinformatics.
Blog
R Programming with Jupyter Notebook - Installing the R Kernel
In an earlier post I covered [installing Jupyter and using Jupyter Notebook]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}). Installing Jupyter sets up the Python kernel for Jupyter Notebook directly, making it possible to program in Python, but for R, another language frequently used in bioinformatics, the corresponding kernel has to be installed separately. In this post I’ll describe how to install that kernel.
First, [Jupyter]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}) and R must already be installed.
Then start an R session from the Terminal with the following command:
Blog
Computing Significance of Overlap between Two Sets using Hypergeometric Test
There are many cases where we have two sets of things (e.g. under two different conditions) such as transcripts, genes or proteins, and we want to compute the significance of the overlap between them. The hypergeometric test is a very simple and widely used option for such cases.
I’ll use the phyper function in R, but you can use the same idea in SciPy (Python).
Let’s say you have 200 genes (A);
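To make the idea concrete, here is a pure-Python sketch of the same upper-tail computation; the function name and argument names are mine, and in practice you would call R’s phyper (or scipy.stats.hypergeom.sf) directly:

```python
from math import comb

def overlap_pvalue(universe, n_a, n_b, overlap):
    """P(X >= overlap): probability of seeing at least `overlap` shared items
    when n_b items are drawn from a universe containing n_a items of set A.
    Mirrors R's phyper(overlap - 1, n_a, universe - n_a, n_b, lower.tail = FALSE)."""
    total = comb(universe, n_b)  # all ways to draw set B
    # Sum hypergeometric probabilities over the upper tail.
    return sum(
        comb(n_a, k) * comb(universe - n_a, n_b - k)
        for k in range(overlap, min(n_a, n_b) + 1)
    ) / total
```

For example, with a universe of 20,000 genes, lists of 200 and 300 genes, and 15 genes in common, `overlap_pvalue(20000, 200, 300, 15)` gives the enrichment p-value of the overlap.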
Blog
Mann Whitney U Test (Wilcoxon Rank-Sum Test) Javascript Implementation
Currently JavaScript is really poor in statistical methods compared to Python (SciPy) and R. There are several efforts to fill this gap, most notably jStat. However, many functions, distributions and tests are still missing from this library. In one of my projects, I had to implement a JavaScript version of the Mann-Whitney U test (also called the Wilcoxon rank-sum test). Here, I’m giving a link to its source code and describing how it works.
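As a sketch of how such an implementation works (written here in Python for brevity, and not the author’s actual JavaScript code), the test boils down to ranking the pooled samples, deriving the U statistic from the rank sum, and approximating the p-value with a normal distribution; ties get average ranks, though the tie correction to the variance is omitted here:

```python
from math import erf, sqrt

def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1  # extend the run of tied values
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test, normal approximation
    (no continuity or tie correction -- a sketch, not a full implementation)."""
    n1, n2 = len(x), len(y)
    r = average_ranks(list(x) + list(y))
    rank_sum_x = sum(r[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2   # U statistic for sample x
    u = min(u1, n1 * n2 - u1)             # smaller of the two U values
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma                  # z <= 0 by construction
    p = 2 * 0.5 * (1 + erf(z / sqrt(2)))  # two-sided p from the normal CDF
    return u, min(p, 1.0)
```

For small samples a production implementation should use the exact U distribution instead of the normal approximation, which is only reliable for larger group sizes.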
Blog
MiClip 1.3 Installation
MiClip is a CLIP-seq peak calling algorithm implemented in R. It currently doesn’t show up on CRAN, but you can obtain it from the archive and install it from the source (tar.gz) file.
Download the tar.gz file:
wget https://cran.r-project.org/src/contrib/Archive/MiClip/MiClip_1.3.tar.gz
Start R:
R
Install the dependencies:
install.packages("moments")
install.packages("VGAM")
Finally, install MiClip 1.3:
install.packages("MiClip_1.3.tar.gz", repos = NULL, type="source")
Then you can test it by loading the package and viewing its help file.
Blog
How to Get Path to or Directory of Current Script in R
Use the following code to get the path to, or the directory of, the current (running) script in R:
scr_dir <- dirname(sys.frame(1)$ofile)
scr_path <- paste(scr_dir, "script.R", sep="/")
Taken from SO
Blog
How to Get (or Load) NCBI GEO Microarray Data into R using GEOquery Package from Bioconductor
R, especially with its many Bioconductor packages, provides nice tools to load, manage and analyze microarray data. If you are trying to load NCBI GEO data into R, use the GEOquery package. Here, I’ll describe how to get started with it, and I’ll probably cover more in future posts.
Installation
source("http://bioconductor.org/biocLite.R")
biocLite("GEOquery")
Usage
library(GEOquery)
gds <- getGEO("GDS5072")
or
library(GEOquery)
gds <- getGEO(filename="path/to/GDS5072.soft.gz")
The getGEO function returns a complex GDS class object which contains the complete dataset.
Blog
Plotting Expression Curves for Experimental Data
Since I can now plot expression curves for in silico data, I moved on to experimental data, which is more complex and larger. This data comes from RPPA experiments on different breast cancer cell lines, and it includes protein abundance measurements for about 45 phosphoproteins. These phosphoproteins are treated with different inhibitors and stimuli, and by comparing their expression, I will try to infer relations between them.
Before moving on to the inference part, I want a script that can plot the graphs so that I can inspect particular results for specific cases.
Blog
Experimental Data Optimization for Network Inference
As I mentioned in my previous post, the experimental data from the challenge has missing values that create problems during analyses. To solve this, the first thing I did was optimize the data, which includes detecting missing conditions, filling in NAs for the missing values, and sorting when necessary.
I wrote two functions in the script. The first ranks the data and sorts it based on those ranks.
Blog
Some String Functions in R, String Manipulation in R
I have programmed with Perl, Python, and PHP before, and string manipulation was more direct and easier in them than in R. Still, there are useful functions for string manipulation in R. I’m not an expert in R, but I’ve been dealing with it for a while and have learned some good functions for this purpose.
Concatenate strings
Concatenation is done with the paste function. It takes the strings to concatenate as arguments, separated by commas, plus the separator character(s).
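For comparison with the languages mentioned above, the closest Python equivalents of paste are str.join and string concatenation; this snippet is illustrative and not part of the original post:

```python
# R: paste("a", "b", "c", sep = "-")  ->  "a-b-c"
result = "-".join(["a", "b", "c"])

# R: paste0("x", 1:3)  ->  "x1" "x2" "x3"  (paste is vectorized over its arguments)
labels = ["x" + str(i) for i in range(1, 4)]
```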
Blog
Network Inference DREAM Breast Cancer Challenge
A causal edge is inferred from the change seen in one node after intervention on another node. If the curves obtained over time overlap (with or without intervention), there is no relation between the nodes. Otherwise, we can draw an edge between them, and depending on whether the level goes up or down, the edge will be activating or inhibiting. These causal edges are context-specific, so in data from different cell lines we may find different relations.
Tag: r programming
Blog
Convert Gene Symbols to Entrez IDs in R
Bioinformatics studies usually use gene symbols as identifiers (IDs), as they are more recognizable than other IDs such as Entrez IDs. However, certain analyses (tools) may not accept gene symbols, because a gene usually has more than one symbol, which makes it harder to implement a method that works with symbols. In such cases, you may need to do a conversion, which is a very common task in bioinformatics.
Blog
Computing Significance of Overlap between Two Sets using Hypergeometric Test
There are many cases where we have two sets of things (e.g. under two different conditions) such as transcripts, genes or proteins, and we want to compute the significance of the overlap between them. The hypergeometric test is a very simple and widely used option for such cases.
I’ll use the phyper function in R, but you can use the same idea in SciPy (Python).
Let’s say you have 200 genes (A);
Blog
Mann Whitney U Test (Wilcoxon Rank-Sum Test) Javascript Implementation
Currently JavaScript is really poor in statistical methods compared to Python (SciPy) and R. There are several efforts to fill this gap, most notably jStat. However, many functions, distributions and tests are still missing from this library. In one of my projects, I had to implement a JavaScript version of the Mann-Whitney U test (also called the Wilcoxon rank-sum test). Here, I’m giving a link to its source code and describing how it works.
Blog
MiClip 1.3 Installation
MiClip is a CLIP-seq peak calling algorithm implemented in R. It currently doesn’t show up on CRAN, but you can obtain it from the archive and install it from the source (tar.gz) file.
Download the tar.gz file:
wget https://cran.r-project.org/src/contrib/Archive/MiClip/MiClip_1.3.tar.gz
Start R:
R
Install the dependencies:
install.packages("moments")
install.packages("VGAM")
Finally, install MiClip 1.3:
install.packages("MiClip_1.3.tar.gz", repos = NULL, type="source")
Then you can test it by loading the package and viewing its help file.
Blog
How to Get Path to or Directory of Current Script in R
Use the following code to get the path to, or the directory of, the current (running) script in R:
scr_dir <- dirname(sys.frame(1)$ofile)
scr_path <- paste(scr_dir, "script.R", sep="/")
Taken from SO
Blog
How to Get (or Load) NCBI GEO Microarray Data into R using GEOquery Package from Bioconductor
R, especially with its many Bioconductor packages, provides nice tools to load, manage and analyze microarray data. If you are trying to load NCBI GEO data into R, use the GEOquery package. Here, I’ll describe how to get started with it, and I’ll probably cover more in future posts.
Installation
source("http://bioconductor.org/biocLite.R")
biocLite("GEOquery")
Usage
library(GEOquery)
gds <- getGEO("GDS5072")
or
library(GEOquery)
gds <- getGEO(filename="path/to/GDS5072.soft.gz")
The getGEO function returns a complex GDS class object which contains the complete dataset.
Blog
Plotting Expression Curves for Experimental Data
Since I can now plot expression curves for in silico data, I moved on to experimental data, which is more complex and larger. This data comes from RPPA experiments on different breast cancer cell lines, and it includes protein abundance measurements for about 45 phosphoproteins. These phosphoproteins are treated with different inhibitors and stimuli, and by comparing their expression, I will try to infer relations between them.
Before moving on to the inference part, I want a script that can plot the graphs so that I can inspect particular results for specific cases.
Blog
Experimental Data Optimization for Network Inference
As I mentioned in my previous post, the experimental data from the challenge has missing values that create problems during analyses. To solve this, the first thing I did was optimize the data, which includes detecting missing conditions, filling in NAs for the missing values, and sorting when necessary.
I wrote two functions in the script. The first ranks the data and sorts it based on those ranks.
Blog
Some String Functions in R, String Manipulation in R
I have programmed with Perl, Python, and PHP before, and string manipulation was more direct and easier in them than in R. Still, there are useful functions for string manipulation in R. I’m not an expert in R, but I’ve been dealing with it for a while and have learned some good functions for this purpose.
Concatenate strings
Concatenation is done with the paste function. It takes the strings to concatenate as arguments, separated by commas, plus the separator character(s).
Blog
Network Inference DREAM Breast Cancer Challenge
A causal edge is inferred from the change seen in one node after intervention on another node. If the curves obtained over time overlap (with or without intervention), there is no relation between the nodes. Otherwise, we can draw an edge between them, and depending on whether the level goes up or down, the edge will be activating or inhibiting. These causal edges are context-specific, so in data from different cell lines we may find different relations.
Tag: pip
Blog
Correct Installation and Configuration of pip2 and pip3
You may have to keep both Python versions, the old 2 and the newer 3, on the same system because of your projects, and each requires its corresponding pip installation so you can install and maintain packages for the two versions separately.
There are multiple ways of installing pip on a system, but configuring the versions and setting the default version for the pip executable can be tricky.
Below is the easiest solution I’ve found.
Tag: python 2
Blog
Correct Installation and Configuration of pip2 and pip3
You may have to keep both Python versions, the old 2 and the newer 3, on the same system because of your projects, and each requires its corresponding pip installation so you can install and maintain packages for the two versions separately.
There are multiple ways of installing pip on a system, but configuring the versions and setting the default version for the pip executable can be tricky.
Below is the easiest solution I’ve found.
Tag: python 3
Blog
Correct Installation and Configuration of pip2 and pip3
You may have to keep both Python versions, the old 2 and the newer 3, on the same system because of your projects, and each requires its corresponding pip installation so you can install and maintain packages for the two versions separately.
There are multiple ways of installing pip on a system, but configuring the versions and setting the default version for the pip executable can be tricky.
Below is the easiest solution I’ve found.
Tag: python package manager
Blog
Correct Installation and Configuration of pip2 and pip3
You may have to keep both Python versions, the old 2 and the newer 3, on the same system because of your projects, and each requires its corresponding pip installation so you can install and maintain packages for the two versions separately.
There are multiple ways of installing pip on a system, but configuring the versions and setting the default version for the pip executable can be tricky.
Below is the easiest solution I’ve found.
Tag: chrome
Blog
Capture Full Size Screenshot on Chrome without Extension
Chrome’s Developer Tools can capture a high quality, full size screenshot of a page, so you no longer need an extension for it!
Update for recent Chrome versions: Chrome DevTools has changed slightly, so here are the new steps (tested in Version 71.0.3578.98 (Official Build) (64-bit) on macOS).
Open the website that you want to capture, then use the Ctrl + Shift + J shortcut on Windows/Linux or Cmd + Opt + J on Mac to open Developer Tools.
Tag: full page screenshot
Blog
Capture Full Size Screenshot on Chrome without Extension
Chrome’s Developer Tools can capture a high quality, full size screenshot of a page, so you no longer need an extension for it!
Update for recent Chrome versions: Chrome DevTools has changed slightly, so here are the new steps (tested in Version 71.0.3578.98 (Official Build) (64-bit) on macOS).
Open the website that you want to capture, then use the Ctrl + Shift + J shortcut on Windows/Linux or Cmd + Opt + J on Mac to open Developer Tools.
Tag: full size
Blog
Capture Full Size Screenshot on Chrome without Extension
Chrome’s new Developer Tools has a way to capture a high-quality, full-size screenshot of the page, so you don’t have to have an extension for it anymore!
Update for the latest Chrome versions: Chrome DevTools has changed slightly, so here are the new steps (tested in Version 71.0.3578.98 (Official Build) (64-bit) on macOS).
Open the website that you want to capture. Use the Ctrl + Shift + J shortcut on Windows/Linux or Cmd + Opt + J on Mac to open Developer Tools.
Tag: screenshot
Blog
Capture Full Size Screenshot on Chrome without Extension
Chrome’s new Developer Tools has a way to capture a high-quality, full-size screenshot of the page, so you don’t have to have an extension for it anymore!
Update for the latest Chrome versions: Chrome DevTools has changed slightly, so here are the new steps (tested in Version 71.0.3578.98 (Official Build) (64-bit) on macOS).
Open the website that you want to capture. Use the Ctrl + Shift + J shortcut on Windows/Linux or Cmd + Opt + J on Mac to open Developer Tools.
Tag: docker
Blog
Memory Leak Testing with Valgrind on macOS using Docker Containers
I had some issues installing Valgrind on macOS High Sierra and [posted some tips to successfully install it to the system]({% post_url 2018-04-28-how-to-install-valgrind-on-macos-high-sierra %}). Although I could install the software this way, it didn’t work correctly when tested with several real and dummy C++ programs. It was giving me a memory leak error even with an empty program. So I decided to use an Ubuntu 16.04 based Docker container and test the code inside the container using the Ubuntu version of Valgrind.
Tag: memory leak
Blog
Memory Leak Testing with Valgrind on macOS using Docker Containers
I had some issues installing Valgrind on macOS High Sierra and [posted some tips to successfully install it to the system]({% post_url 2018-04-28-how-to-install-valgrind-on-macos-high-sierra %}). Although I could install the software this way, it didn’t work correctly when tested with several real and dummy C++ programs. It was giving me a memory leak error even with an empty program. So I decided to use an Ubuntu 16.04 based Docker container and test the code inside the container using the Ubuntu version of Valgrind.
Tag: memory management
Blog
Memory Leak Testing with Valgrind on macOS using Docker Containers
I had some issues installing Valgrind on macOS High Sierra and [posted some tips to successfully install it to the system]({% post_url 2018-04-28-how-to-install-valgrind-on-macos-high-sierra %}). Although I could install the software this way, it didn’t work correctly when tested with several real and dummy C++ programs. It was giving me a memory leak error even with an empty program. So I decided to use an Ubuntu 16.04 based Docker container and test the code inside the container using the Ubuntu version of Valgrind.
Tag: valgrind
Blog
Memory Leak Testing with Valgrind on macOS using Docker Containers
I had some issues installing Valgrind on macOS High Sierra and [posted some tips to successfully install it to the system]({% post_url 2018-04-28-how-to-install-valgrind-on-macos-high-sierra %}). Although I could install the software this way, it didn’t work correctly when tested with several real and dummy C++ programs. It was giving me a memory leak error even with an empty program. So I decided to use an Ubuntu 16.04 based Docker container and test the code inside the container using the Ubuntu version of Valgrind.
Blog
How to Install Valgrind on macOS High Sierra
Valgrind is a programming tool for memory debugging, memory leak detection and profiling. Its installation on macOS High Sierra seems problematic, and I wanted to write this post to share the solution that worked for me. I use Homebrew to install it, which is the recommended way, and the solution relies on it as well.
So, when you try installing right away, you may get the following error:
brew install valgrind
valgrind: This formula either does not compile or function as expected on macOS
versions newer than Sierra due to an upstream incompatibility.
Tag: fasta
Blog
How to Download hg38/GRCh38 FASTA Human Reference Genome
hg38/GRCh38 is the latest human reference genome as of this writing; it was released in December 2013. There are multiple sources for downloading it, and it also comes in different versions.
The most well-known databases for downloading the human reference genome are the UCSC Genome Browser, Ensembl and NCBI. The naming convention hg38 is used by the UCSC Genome Browser, while Ensembl and NCBI use GRCh38 to refer to the latest human reference genome.
Blog
How to Convert PED to FASTA
You may need to convert PED files to FASTA format in your studies for further analyses. Use the script below for this purpose.
PED to FASTA converter on GitHub
It takes the first 6 columns of each line as the header line and the rest as the sequence, replacing 0s with Ns, and organizes the result into a FASTA file.
Note that 0s denote missing nucleotides, as defined by default in PLINK.
How to run:
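As a rough illustration of the transformation described above (this is not the GitHub script itself; the exact header formatting and column handling there may differ), a minimal Python sketch could look like this:

```python
def ped_to_fasta(ped_lines):
    """Convert PED records to FASTA: the first 6 columns become the header,
    the remaining columns the sequence, with 0 (PLINK's missing code) -> N."""
    records = []
    for line in ped_lines:
        fields = line.split()
        header = "_".join(fields[:6])                    # family/sample metadata
        seq = "".join(fields[6:]).replace("0", "N")      # genotypes, 0 -> N
        records.append(">%s\n%s" % (header, seq))
    return "\n".join(records)

# One hypothetical PED line: 6 metadata columns, then genotype alleles.
print(ped_to_fasta(["FAM1 IND1 0 0 1 2 A C 0 G"]))
```

Running it prints a single FASTA record whose sequence is ACNG, with the missing allele replaced by N.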
Blog
Download Human Reference Genome (HG19 - GRCh37)
Many variation calling tools and many other methods in bioinformatics require a reference genome as an input, so you may need to download the human reference genome or sequences. There are several sources that freely and publicly provide the entire human genome, and I’ll describe how to download the complete human genome from the University of California, Santa Cruz (UCSC) webpage.
Index to the gzip-compressed FASTA files of human chromosomes can be found here at the UCSC webpage.
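As a sketch of how the per-chromosome files could be fetched programmatically (the directory layout below is an assumption based on UCSC's usual goldenPath structure; check the index page mentioned above for the real paths):

```python
# Assumed UCSC layout: goldenPath/<assembly>/chromosomes/chr<N>.fa.gz
BASE_URL = "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes"

def chromosome_url(chrom):
    """Build the URL of one gzip-compressed chromosome FASTA file."""
    return "%s/chr%s.fa.gz" % (BASE_URL, chrom)

# Downloading could then be done with the standard library, e.g.:
# import urllib.request
# urllib.request.urlretrieve(chromosome_url(21), "chr21.fa.gz")
print(chromosome_url(21))
```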
Blog
How to Get Transcripts (also Exons & Introns) of a Gene using Ensembl API
As a part of my project, I need to obtain the exons and introns of certain genes. These are actually human genes selected for a specific reason that I will describe later when I explain my project. But for now, I want to share the way to obtain this information using the (Perl) Ensembl API. Note that Ensembl has started a beautiful way (the Ensembl REST API) of getting data, but it is in beta and doesn’t provide exon/intron information.
Blog
Running MegaBLAST from Multiple Sequence Files
I wrote the script below to fit a technique we came up with to make the pipeline’s MegaBLAST search faster. What it does is search the given databases using the sequence files created and formatted for each read, with the specified starting point and number of reads.
#!/usr/bin/perl

$database = $ARGV[0];
$dir = $ARGV[1]; # directory for sequences
$sp = $ARGV[2];  # starting point
$n = $ARGV[3] + $sp;

while (1) {
    system("blastplus -programname=megablast $dir/read_$sp.seq $database -OUTFILE=read_$sp.megablast -nobatch -d");
    $sp++;
    last if ($sp == $n);
}
Here everything works with really simple programming.
Blog
Running MegaBLAST from a Single FASTA File - Regular Expressions
Below is the Perl script I wrote to run MegaBLAST by reading a FASTA file and collect the results in a directory, together with its explanation. This script is an important part of the pipeline I’m designing. It is the first script I wrote, and it reaches all the reads through a single FASTA file.
#!/usr/bin/perl
$database = $ARGV[0];
$fasta = $ARGV[1]; # input file
$sp = $ARGV[2];    # starting point
$n = $ARGV[3] + $sp;

if (!defined($n)) { $n = 12; } # set default number

open FASTA, $fasta or die $!
Blog
Contents of a MegaBLAST Output - RefSeq Database
Below we see the details of one hit from the file I obtained by searching the test FASTA file against the refseq_genomic database.
>>>>refseq_genomic_complete3: AC_000033_0310 Continuation (311 of 1357) of AC_000033 from base 31000001 (AC_000033 Mus musculus strain mixed chromosome 11, alternate assembly Mm_Celera, whole genome shotgun sequence. 2/2012)
Length = 110000
Score = 115 bits (58), Expect = 4e-22
Identities = 74/79 (93%), Gaps = 2/79 (2%)
Strand = Plus / Minus

Query: 1      ctctctctgtct-tctctctctctctgtctctctctctttctctctcttctctctctctc 59
              |||||||||||| ||| ||||||||| ||||||||||| |||||||||||||||||||||
Sbjct: 89773  ctctctctgtctgtctttctctctctctctctctctctctctctctcttctctctctctc 89714

Query: 60     tttctctctgccctctctc 78
              ||||||||| |||||||||
Sbjct: 89713  tttctctct-ccctctctc 89696
In the details, it first gives the header information about the hit with the >>>> characters.
Blog
Speeding Up the MegaBLAST Search
Lately I’ve been looking for the quickest and most effective way to run MegaBLAST, just against different databases, and at the FASTA file creation stage a really useful method came from my advisor.
Previously I searched from a single FASTA file containing all the sequences, and this caused a loss of time. Even though the file is opened only once, going to the right lines in the file and reading them every time is a time-consuming operation. We solved this by turning each read in the file into a separate FASTA file.
Blog
A FASTQ to FASTA Conversion Perl Script
FASTQ and FASTA are file formats that actually contain the same information, except that one of them holds just two fewer lines of information per sequence. Another difference, important for my project, is that the FASTA format can be searched directly with MegaBLAST. That’s why I need to convert the FASTQ format produced by sequencing machines to FASTA. And this script is the first step of the pipeline.
Actually, since the test sequencing data I received had not been aligned by the party who delivered it to me, I had performed this alignment as a preliminary step.
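The conversion itself can be sketched as follows (a simplified Python illustration, not the Perl script from the post, assuming well-formed four-line FASTQ records):

```python
def fastq_to_fasta(fastq_lines):
    """FASTQ stores four lines per read (@id, sequence, '+', qualities);
    FASTA keeps only the first two, with '@' replaced by '>'."""
    fasta = []
    for i in range(0, len(fastq_lines), 4):
        fasta.append(">" + fastq_lines[i][1:])  # @id -> >id
        fasta.append(fastq_lines[i + 1])        # the sequence line
    return fasta

# One hypothetical read in FASTQ form:
print("\n".join(fastq_to_fasta(["@read1", "ACGT", "+", "IIII"])))
```

The two quality-related lines are simply dropped, which is why FASTA files are roughly half the size.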
Blog
SAM File - BAM File - samtools
The pipeline I actually have to program will run its analyses directly on unmapped reads. But since I couldn’t find such data, and the only data I have contains both mapped and unmapped reads, I first had to get rid of the mapped ones.
As I mentioned before, I do this with the bwa aligner (mapper). bwa produces a SAM file after a series of steps, but I need a FASTQ file. For that, I’ll convert the SAM file to the similar BAM format with samtools and then obtain my FASTQ file with the bam2fastq tool.
Blog
MegaBLAST - A Tool for Finding Similarities in Sequences
MegaBLAST, found in the HUSAR package, is part of the BLAST (Basic Local Alignment Search Tool) suite and is a variant of BLASTN. MegaBLAST processes long sequences more efficiently than BLASTN and runs much faster, but it is less sensitive. This makes it a very suitable tool for searching for similar sequences in large databases.
The program I’ll write will take a FASTA file containing multiple sequences and run the megablast command. Then, for each read, a .
Blog
Contaminant Analysis Project
To start, I’ll describe in detail this small project that was given to me so I can get used to the tools and the programming language, in short, to bioinformatics.
We know that however hard we try to prevent it in our laboratory work, the risk of contaminants is always there. The more we reduce it the better, and we can later determine its amount and use it for another evaluation of our result. One method for finding this is DNA analysis: the DNA of the sample you are working with is sequenced, and by analyzing this DNA with various programs, we can reveal the contaminating organisms from their DNA.
Blog
FASTQ Format - FASTQ File
Today I received the “test” sequencing data I’ll use while building the program. It consists of two FASTQ files, each compressed and yet still around 6 GB in size. Since I certainly don’t want to lose too much time, I’ll use only a portion of one of these files.
My goal is to find the mappable reads in these FASTQ files with the BWA tool and then remove them, and to save the remaining unmapped reads in a language MegaBLAST can understand (the FASTA format).
By the way, since I’m preparing the whole project on a Unix computer, I’m learning many commands; I’ll try to write about them separately later.
Tag: grch38
Blog
How to Download hg38/GRCh38 FASTA Human Reference Genome
hg38/GRCh38 is the latest human reference genome as of this writing; it was released in December 2013. There are multiple sources for downloading it, and it also comes in different versions.
The most well-known databases for downloading the human reference genome are the UCSC Genome Browser, Ensembl and NCBI. The naming convention hg38 is used by the UCSC Genome Browser, while Ensembl and NCBI use GRCh38 to refer to the latest human reference genome.
Tag: hg38
Blog
How to Download hg38/GRCh38 FASTA Human Reference Genome
hg38/GRCh38 is the latest human reference genome as of this writing; it was released in December 2013. There are multiple sources for downloading it, and it also comes in different versions.
The most well-known databases for downloading the human reference genome are the UCSC Genome Browser, Ensembl and NCBI. The naming convention hg38 is used by the UCSC Genome Browser, while Ensembl and NCBI use GRCh38 to refer to the latest human reference genome.
Tag: human
Blog
How to Download hg38/GRCh38 FASTA Human Reference Genome
hg38/GRCh38 is the latest human reference genome as of this writing; it was released in December 2013. There are multiple sources for downloading it, and it also comes in different versions.
The most well-known databases for downloading the human reference genome are the UCSC Genome Browser, Ensembl and NCBI. The naming convention hg38 is used by the UCSC Genome Browser, while Ensembl and NCBI use GRCh38 to refer to the latest human reference genome.
Blog
Data Preprocessing II for Salmon Project
So, in our Multi-dimensional Modeling and Reconstruction of Signaling Networks in Salmonella-infected Human Cells project, we have several methods for constructing the networks, so the data still needs to be preprocessed to be ready for analysis with these methods.
One method needed a matrix whose first row holds the protein name and the time series (2 min, 5 min, 10 min, 20 min), with the value of each protein at each time point set to 1 or 0 according to variance, significance and the size of the fold change.
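The binarization step could look like the following sketch. The single fold-change cutoff here is a hypothetical simplification; the project's real criteria also involve variance and significance:

```python
def binarize(fold_changes, cutoff=1.0):
    """Map a protein's time-series fold changes (e.g. at 2, 5, 10, 20 min)
    to 1/0 by an absolute cutoff."""
    return [1 if abs(fc) >= cutoff else 0 for fc in fold_changes]

# Hypothetical example data: fold changes per time point for two proteins.
data = {"ProteinA": [0.2, 1.5, 2.1, 0.4],
        "ProteinB": [1.2, 0.1, 0.0, 3.3]}
matrix = {protein: binarize(values) for protein, values in data.items()}
print(matrix["ProteinA"])  # [0, 1, 1, 0]
```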
Blog
Multi-dimensional Modeling and Reconstruction of Signaling Networks in Salmonella-infected Human Cells
In this study, we’re going to use phosphorylation data from a research paper on the phosphoproteomic analysis of related cells.
The idea is to use and compare existing methods and develop these methods to be able to better understand the nature of signaling events in these cells and to find key proteins that might be targets for disease diagnosis, prevention and treatment.
This study will be submitted as a research paper, so I’m not going to publish any results here for now, but I’ll mention the struggles I have and the solutions I try.
Blog
Download Human Reference Genome (HG19 - GRCh37)
Many variation calling tools and many other methods in bioinformatics require a reference genome as an input, so you may need to download the human reference genome or sequences. There are several sources that freely and publicly provide the entire human genome, and I’ll describe how to download the complete human genome from the University of California, Santa Cruz (UCSC) webpage.
Index to the gzip-compressed FASTA files of human chromosomes can be found here at the UCSC webpage.
Blog
Super Long Introns of Euarchontoglires
There was another weird result I got in my exon/intron boundary analysis research. Intron lengths are shown to increase in the genes of less diverse species. However, according to my findings, at the point of Euarchontoglires, or Supraprimates, this increase is very sharp and seems unexpected. So, I looked at the exon/intron lengths of each gene in each taxonomic rank and tried to see what gives Euarchontoglires genes such long introns.
As you can see in the graph above, Euarchontoglires introns are very long compared to the rest.
Tag: reference genome
Blog
How to Download hg38/GRCh38 FASTA Human Reference Genome
hg38/GRCh38 is the latest human reference genome as of this writing; it was released in December 2013. There are multiple sources for downloading it, and it also comes in different versions.
The most well-known databases for downloading the human reference genome are the UCSC Genome Browser, Ensembl and NCBI. The naming convention hg38 is used by the UCSC Genome Browser, while Ensembl and NCBI use GRCh38 to refer to the latest human reference genome.
Tag: brew
Blog
How to Install Valgrind on macOS High Sierra
Valgrind is a programming tool for memory debugging, memory leak detection and profiling. Its installation on macOS High Sierra seems problematic, and I wanted to write this post to share the solution that worked for me. I use Homebrew to install it, which is the recommended way, and the solution relies on it as well.
So, when you try installing right away, you may get the following error:
brew install valgrind
valgrind: This formula either does not compile or function as expected on macOS
versions newer than Sierra due to an upstream incompatibility.
Tag: high sierra
Blog
How to Install Valgrind on macOS High Sierra
Valgrind is a programming tool for memory debugging, memory leak detection and profiling. Its installation on macOS High Sierra seems problematic, and I wanted to write this post to share the solution that worked for me. I use Homebrew to install it, which is the recommended way, and the solution relies on it as well.
So, when you try installing right away, you may get the following error:
brew install valgrind
valgrind: This formula either does not compile or function as expected on macOS
versions newer than Sierra due to an upstream incompatibility.
Tag: corrplot
Blog
R Programming with Jupyter Notebook - Installing the R Kernel
In an earlier post I covered [installing Jupyter and Jupyter Notebook]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}). Installing Jupyter sets up the Python kernel for Jupyter Notebook directly, making programming in Python possible, but for R, another language frequently used in bioinformatics, the corresponding kernel has to be installed separately. In this post I’ll walk through installing that kernel.
First, the [Jupyter installation]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}) and the R installation need to be completed.
Then start an R session from the Terminal using the following command:
Tag: irkernel
Blog
R Programming with Jupyter Notebook - Installing the R Kernel
In an earlier post I covered [installing Jupyter and Jupyter Notebook]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}). Installing Jupyter sets up the Python kernel for Jupyter Notebook directly, making programming in Python possible, but for R, another language frequently used in bioinformatics, the corresponding kernel has to be installed separately. In this post I’ll walk through installing that kernel.
First, the [Jupyter installation]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}) and the R installation need to be completed.
Then start an R session from the Terminal using the following command:
Tag: jupyter
Blog
R Programming with Jupyter Notebook - Installing the R Kernel
In an earlier post I covered [installing Jupyter and Jupyter Notebook]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}). Installing Jupyter sets up the Python kernel for Jupyter Notebook directly, making programming in Python possible, but for R, another language frequently used in bioinformatics, the corresponding kernel has to be installed separately. In this post I’ll walk through installing that kernel.
First, the [Jupyter installation]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}) and the R installation need to be completed.
Then start an R session from the Terminal using the following command:
Blog
What Are Jupyter / Python and How to Install Them?
Jupyter is software that provides an interactive environment for various programming languages. It was originally developed for the Python programming language under the name IPython (interactive Python), but later its creators started the Jupyter project and moved many parts of IPython over to Jupyter. IPython now lives on only as Jupyter’s kernel.
Jupyter’s features:
An interactive shell: launched from the Command Prompt/Terminal with the jupyter console command, it offers user-friendly features, such as autocompletion, over the original Python shell. A browser-based notebook: launched from the Command Prompt/Terminal with the jupyter notebook command; in the browser window that opens, you can create a new notebook, write code in various programming languages, run it, and view the output (text, graphics, etc.) interactively right in the browser.
Tag: correlation
Blog
R Programming with Jupyter Notebook - Installing the R Kernel
In an earlier post I covered [installing Jupyter and Jupyter Notebook]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}). Installing Jupyter sets up the Python kernel for Jupyter Notebook directly, making programming in Python possible, but for R, another language frequently used in bioinformatics, the corresponding kernel has to be installed separately. In this post I’ll walk through installing that kernel.
First, the [Jupyter installation]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}) and the R installation need to be completed.
Then start an R session from the Terminal using the following command:
Tag: correlation analysis
Blog
R Programming with Jupyter Notebook - Installing the R Kernel
In an earlier post I covered [installing Jupyter and Jupyter Notebook]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}). Installing Jupyter sets up the Python kernel for Jupyter Notebook directly, making programming in Python possible, but for R, another language frequently used in bioinformatics, the corresponding kernel has to be installed separately. In this post I’ll walk through installing that kernel.
First, the [Jupyter installation]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}) and the R installation need to be completed.
Then start an R session from the Terminal using the following command:
Tag: correlation plot
Blog
R Programming with Jupyter Notebook - Installing the R Kernel
In an earlier post I covered [installing Jupyter and Jupyter Notebook]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}). Installing Jupyter sets up the Python kernel for Jupyter Notebook directly, making programming in Python possible, but for R, another language frequently used in bioinformatics, the corresponding kernel has to be installed separately. In this post I’ll walk through installing that kernel.
First, the [Jupyter installation]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}) and the R installation need to be completed.
Then start an R session from the Terminal using the following command:
Tag: installation
Blog
R Programming with Jupyter Notebook - Installing the R Kernel
In an earlier post I covered [installing Jupyter and Jupyter Notebook]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}). Installing Jupyter sets up the Python kernel for Jupyter Notebook directly, making programming in Python possible, but for R, another language frequently used in bioinformatics, the corresponding kernel has to be installed separately. In this post I’ll walk through installing that kernel.
First, the [Jupyter installation]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}) and the R installation need to be completed.
Then start an R session from the Terminal using the following command:
Blog
What Are Jupyter / Python and How to Install Them?
Jupyter is software that provides an interactive environment for various programming languages. It was originally developed for the Python programming language under the name IPython (interactive Python), but later its creators started the Jupyter project and moved many parts of IPython over to Jupyter. IPython now lives on only as Jupyter’s kernel.
Jupyter’s features:
An interactive shell: launched from the Command Prompt/Terminal with the jupyter console command, it offers user-friendly features, such as autocompletion, over the original Python shell. A browser-based notebook: launched from the Command Prompt/Terminal with the jupyter notebook command; in the browser window that opens, you can create a new notebook, write code in various programming languages, run it, and view the output (text, graphics, etc.) interactively right in the browser.
Tag: notebook
Blog
R Programming with Jupyter Notebook - Installing the R Kernel
In an earlier post I covered [installing Jupyter and Jupyter Notebook]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}). Installing Jupyter sets up the Python kernel for Jupyter Notebook directly, making programming in Python possible, but for R, another language frequently used in bioinformatics, the corresponding kernel has to be installed separately. In this post I’ll walk through installing that kernel.
First, the [Jupyter installation]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}) and the R installation need to be completed.
Then start an R session from the Terminal using the following command:
Blog
What Are Jupyter / Python and How to Install Them?
Jupyter is software that provides an interactive environment for various programming languages. It was originally developed for the Python programming language under the name IPython (interactive Python), but later its creators started the Jupyter project and moved many parts of IPython over to Jupyter. IPython now lives on only as Jupyter’s kernel.
Jupyter’s features:
An interactive shell: launched from the Command Prompt/Terminal with the jupyter console command, it offers user-friendly features, such as autocompletion, over the original Python shell. A browser-based notebook: launched from the Command Prompt/Terminal with the jupyter notebook command; in the browser window that opens, you can create a new notebook, write code in various programming languages, run it, and view the output (text, graphics, etc.) interactively right in the browser.
Tag: r kernel
Blog
R Programming with Jupyter Notebook - Installing the R Kernel
In an earlier post I covered [installing Jupyter and Jupyter Notebook]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}). Installing Jupyter sets up the Python kernel for Jupyter Notebook directly, making programming in Python possible, but for R, another language frequently used in bioinformatics, the corresponding kernel has to be installed separately. In this post I’ll walk through installing that kernel.
First, the [Jupyter installation]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}) and the R installation need to be completed.
Then start an R session from the Terminal using the following command:
Tag: r programming
Blog
R Programming with Jupyter Notebook - Installing the R Kernel
In an earlier post I covered [installing Jupyter and Jupyter Notebook]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}). Installing Jupyter sets up the Python kernel for Jupyter Notebook directly, making programming in Python possible, but for R, another language frequently used in bioinformatics, the corresponding kernel has to be installed separately. In this post I’ll walk through installing that kernel.
First, the [Jupyter installation]({% post_url 2018-03-31-jupyter-python-nedir-nasil-kurulur %}) and the R installation need to be completed.
Then start an R session from the Terminal using the following command:
Tag: anaconda
Blog
What Are Jupyter / Python and How to Install Them?
Jupyter is software that provides an interactive environment for various programming languages. It was originally developed for the Python programming language under the name IPython (interactive Python), but later its creators started the Jupyter project and moved many parts of IPython over to Jupyter. IPython now lives on only as Jupyter’s kernel.
Jupyter’s features:
An interactive shell: launched from the Command Prompt/Terminal with the jupyter console command, it offers user-friendly features, such as autocompletion, over the original Python shell. A browser-based notebook: launched from the Command Prompt/Terminal with the jupyter notebook command; in the browser window that opens, you can create a new notebook, write code in various programming languages, run it, and view the output (text, graphics, etc.) interactively right in the browser.
Tag: ipython
Blog
What Are Jupyter / Python and How to Install Them?
Jupyter is software that provides an interactive environment for various programming languages. It was originally developed for the Python programming language under the name IPython (interactive Python), but later its creators started the Jupyter project and moved many parts of IPython over to Jupyter. IPython now lives on only as Jupyter’s kernel.
Jupyter’s features:
An interactive shell: launched from the Command Prompt/Terminal with the jupyter console command, it offers user-friendly features, such as autocompletion, over the original Python shell. A browser-based notebook: launched from the Command Prompt/Terminal with the jupyter notebook command; in the browser window that opens, you can create a new notebook, write code in various programming languages, run it, and view the output (text, graphics, etc.) interactively right in the browser.
Tag: python
Blog
What Are Jupyter / Python and How to Install Them?
Jupyter is software that provides an interactive environment for various programming languages. It was originally developed for the Python programming language under the name IPython (interactive Python), but later its creators started the Jupyter project and moved many parts of IPython over to Jupyter. IPython now lives on only as Jupyter’s kernel.
Jupyter’s features:
An interactive shell: launched from the Command Prompt/Terminal with the jupyter console command, it offers user-friendly features, such as autocompletion, over the original Python shell. A browser-based notebook: launched from the Command Prompt/Terminal with the jupyter notebook command; in the browser window that opens, you can create a new notebook, write code in various programming languages, run it, and view the output (text, graphics, etc.) interactively right in the browser.
Blog
Install Cairo Graphics and PyCairo on Ubuntu 14.04 / Linux Mint 17
Cairo is a 2D graphics library written in the C programming language, but if you’d like to use it from Python, you should also install the Python bindings for Cairo.
This guide will go through the installation of the Cairo graphics library version 1.14.2 (the most recent) and the py2cairo Python bindings version 1.10.1 (also the most recent).
Install Cairo
It’s very easy with the following repository. Just add it, update your packages and install.
Blog
Install RDKit 2015-03 Build on Ubuntu 14.04 / Linux Mint 17
RDKit is an open source toolkit for cheminformatics. It has many functionalities to work with chemical files.
Follow the guide below to install the RDKit 2015-03 build on an Ubuntu 14.04 / Linux Mint 17 computer. Since the Ubuntu packages don’t have the latest RDKit for trusty, you have to build RDKit from its source.
Install Dependencies
sudo apt-get install flex bison build-essential python-numpy cmake python-dev sqlite3 libsqlite3-dev libboost1.54-all-dev
Download the Build
Blog
Simple Way of Python's subprocess.Popen with a Timeout Option
subprocess module in Python provides us a variety of methods to start a process from a Python script. We may use these methods to run an external commands / programs, collect their output and manage them. An example use of it might be as following:
from subprocess import Popen, PIPE


p = Popen(['ls', '-l'], stdout=PIPE, stderr=PIPE)
stdout, stderr = p.communicate()
print stdout, stderr
These lines run the ls -l command and collect the output (standard output and standard error) in the stdout and stderr variables using the communicate method defined on the process.
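The snippet above is Python 2. For the timeout behavior the post's title promises, Python 3's communicate() accepts a timeout parameter directly; here is a minimal sketch of that idea (the ls -l command and the 5-second limit are just examples):

```python
from subprocess import Popen, PIPE, TimeoutExpired

# Start the process as before, but give up after a deadline.
p = Popen(['ls', '-l'], stdout=PIPE, stderr=PIPE)
try:
    stdout, stderr = p.communicate(timeout=5)
except TimeoutExpired:
    p.kill()  # stop the runaway process before collecting its output
    stdout, stderr = p.communicate()
print(stdout.decode())
```

On Python 2, where communicate() has no timeout, the same effect is usually achieved with a watchdog thread that kills the process after the deadline.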
Blog
ImportError: Reportlab Version 2.1+ is needed
Little bug in xhtml2pdf version 0.0.5. To fix:
$ sudo nano /usr/local/lib/python2.7/dist-packages/xhtml2pdf/util.py
Change the following lines:
if not (reportlab.Version[0] == "2" and reportlab.Version[2] >= "1"):
    raise ImportError("Reportlab Version 2.1+ is needed!")

REPORTLAB22 = (reportlab.Version[0] == "2" and reportlab.Version[2] >= "2")
With these lines:
if not (reportlab.Version[:3] >= "2.1"):
    raise ImportError("Reportlab Version 2.1+ is needed!")

REPORTLAB22 = (reportlab.Version[:3] >= "2.1")
Blog
Django Migrations Table Already Exists Fix
Fix this issue by faking the migrations:
python manage.py migrate --fake <appname>
Taken from this SO answer
Blog
Mezzanine BS Banners Translation with django-modeltranslation
Mezzanine BS Banners is a nice app for implementing Bootstrap 3 banners/sliders in your Mezzanine projects. The Banners model in the BS Banners app has a title, and its stacked-inline Slides model has a title and content for translation.
After [installing and setting up Django/Mezzanine translations]({% post_url 2015-07-01-djangomezzanine-content-translation-for-mezzanine %}):
Create a translation.py inside your Mezzanine project or your custom theme/skin application and copy/paste following lines:
from modeltranslation.translator import translator
from mezzanine.core.translation import TranslatedSlugged, TranslatedRichText
from mezzanine_bsbanners.
Blog
Django/Mezzanine Content Translation for Mezzanine Built-in Applications
Mezzanine comes with additional Django applications such as pages and galleries, and to translate their content, Mezzanine supports django-modeltranslation integration.
Install django-modeltranslation:
pip install django-modeltranslation
Add the following to INSTALLED_APPS in settings.py:
"modeltranslation",
And the following in settings.py:
USE_MODELTRANSLATION = True
Also, move mezzanine.pages above the other Mezzanine apps in INSTALLED_APPS in settings.py like so:
"mezzanine.pages",
"mezzanine.boot",
"mezzanine.conf",
"mezzanine.core",
"mezzanine.generic",
"mezzanine.blog",
"mezzanine.forms",
"mezzanine.galleries",
"mezzanine.twitter",
"mezzanine.accounts",
"mezzanine.mobile",
Run the following to create fields in database tables for translations:
Blog
Setting Up Templates and Python Scripts for Translation
Templates need the following template tag:
{% raw %}{% load i18n %}{% endraw %}
Then, wrapping any text with
{% raw %}{% trans "TEXT" %}{% endraw %}
will make it translatable via the Rosetta Django application.
In Python scripts, you need the following import:
from django.utils.translation import ugettext_lazy as _
Then wrapping any text with
_('TEXT')
will make it translatable.
Blog
Django Rosetta Translations for Django Applications
Make a directory called locale/ under the application directory:
cd app_name
mkdir locale
Add the folder to the LOCALE_PATHS tuple in settings.py:
LOCALE_PATHS = (
    os.path.join(PROJECT_ROOT, 'app_name', 'locale/'),
)
Run the following commands to create the PO translation file for the application and compile it:
python ../manage.py makemessages -l tr -e html,py,txt
python ../manage.py compilemessages
The -l option is for language; it should match your definition in settings.py:
LANGUAGES = (
    ('en', _('English')),
    ('tr', _('Turkish')),
    ('it', _('Italian')),
)
Repeat the last step for all languages and then go to the Rosetta URL to translate.
Blog
Django Rosetta Installation
Install SciPy:
sudo apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose
Install pymongo and nltk:
sudo pip install pymongo
sudo pip install nltk
Install Python MySQLdb:
sudo apt-get install python-mysqldb
Install Rosetta:
sudo pip install django-rosetta
Add the following to INSTALLED_APPS in settings.py:
"rosetta",
Add the following to urls.py:
url(r'^translations/', include('rosetta.urls')),
To also allow language prefixes, change patterns to i18n_patterns in urls.py:
urlpatterns += i18n_patterns(
    ...
)
Blog
Errno 13 Permission denied Django File Uploads
Run the following command to give www-data permissions to the static folder and all its content:
cd path/to/your/django/project
sudo chown -R www-data:www-data static/
Do this on your production server.
Blog
Configuring Mezzanine for Apache server & mod_wsgi in AWS
Install [Mezzanine]({% post_url 2015-05-01-how-to-install-mezzanine-on-ubuntulinux-mint %}), [Apache server]({% post_url 2015-05-08-getting-started-with-your-aws-instance-and %}) and mod_wsgi:
sudo apt-get install libapache2-mod-wsgi
sudo a2enmod wsgi
Set up a MySQL database for your Mezzanine project
Read [my post on how to set up a MySQL database for a Mezzanine project]({% post_url 2015-05-09-how-to-set-up-a-mysql-database-for-a-mezzanine %})
Collect static files:
python manage.py collectstatic
Configure your Apache server configuration for the project like the following:
WSGIPythonPath /home/ubuntu/www/mezzanine-project
<VirtualHost *:80>
    #ServerName example.com
    ServerAdmin admin@example.com
    DocumentRoot /home/ubuntu/www/mezzanine-project
    WSGIScriptAlias / /home/ubuntu/www/mezzanine-project/wsgi.
Blog
How to Set Up a MySQL Database for a Mezzanine Project
Install MySQL server and python-mysqldb package:
sudo apt-get install mysql-server
sudo apt-get install python-mysqldb
Run MySQL:
mysql -u root -p
Create a database:
mysql> create database mezzanine_project;
Confirm it:
mysql> show databases;
Exit:
mysql> exit
Configure local_settings.py:
cd path/to/your/mezzanine/project
nano local_settings.py
Like the following:
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "mezzanine_project",
        "USER": "root",
        "PASSWORD": "123456",
        "HOST": "",
        "PORT": "",
    }
}
Note: replace the password with your own.
Blog
How to Install Mezzanine on Ubuntu/Linux Mint [Complete Guide]
Mezzanine is a CMS application built on the Django web framework. The installation steps are easy, but your environment may not be suitable enough for it to work without a problem. So here I'm going to describe a complete installation from scratch in a virtual environment.
First of all, install virtualenv:
$ sudo apt-get install python-virtualenv
Then, create a virtual environment:
$ virtualenv testenv
And activate it:
$ cd testenv
$ source bin/activate
Blog
Finding k-cores and Clustering Coefficient Computation with NetworkX
Assume you have a large network, and you want to find the k-core of each node and also compute the clustering coefficient for each one. The Python package NetworkX comes with very nice methods to do these easily.
A k-core is a maximal subgraph whose nodes all have degree at least k [1]. To find k-cores:
Add all the edges of your network to a NetworkX graph, and use the core_number method, which takes the graph as its single input and returns node – core-number pairs.
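NetworkX's core_number does this for you; to make the idea concrete, here is a plain-Python sketch of the peeling algorithm the k-core decomposition is based on, run on a made-up toy graph (a triangle with one pendant node):

```python
def core_numbers(edges):
    """Return {node: core number} by repeatedly peeling off
    nodes whose remaining degree is below k."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    core, remaining, k = {}, set(adj), 0
    while remaining:
        k += 1
        changed = True
        while changed:  # peel until the k-core is stable
            changed = False
            for n in list(remaining):
                if len(adj[n] & remaining) < k:
                    core[n] = k - 1  # n survived up to the (k-1)-core
                    remaining.remove(n)
                    changed = True
    return core

# triangle 1-2-3 plus pendant node 4
print(sorted(core_numbers([(1, 2), (1, 3), (2, 3), (3, 4)]).items()))
# → [(1, 2), (2, 2), (3, 2), (4, 1)]
```

With NetworkX itself, nx.core_number(G) and nx.clustering(G) return the same kind of node-to-value dictionaries.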
Blog
Searching Open Reading Frames (ORF) in DNA sequences - ORF Finder
Open reading frames (ORFs) are regions on DNA that are translated into protein. They lie between start and stop codons and are usually long.
The Python script below searches for ORFs in six frames and returns the longest one. It doesn't treat the start codon as a delimiter and only splits the sequence by stop codons, so an ORF can start with any codon but ends with a stop codon (TAG, TGA, TAA).
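The script itself is cut off in this excerpt, but the behavior described (split each of the six reading frames by stop codons, keep the longest fragment) can be sketched like this; the function name and the example sequence are mine, not from the original post:

```python
STOPS = {"TAG", "TGA", "TAA"}
COMP = {"A": "T", "T": "A", "G": "C", "C": "G"}

def longest_orf(seq):
    """Longest stop-codon-free stretch over all six reading frames."""
    best = ""
    revcomp = "".join(COMP[b] for b in reversed(seq))
    for strand in (seq, revcomp):        # forward and reverse strands
        for frame in range(3):           # three frames per strand
            current = ""
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if codon in STOPS:       # stop codons delimit candidates
                    best = max(best, current, key=len)
                    current = ""
                else:
                    current += codon
            best = max(best, current, key=len)
    return best

print(longest_orf("AAATAAGGGTGA"))  # → TCACCCTTATTT (from the reverse strand)
```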
Blog
Python: Get Longest String in a List
Here is a quick Python trick you might use in your code.
Assume you have a list of strings and you want to get the longest one in the most efficient way.
>>> l = ["aaa", "bb", "c"]
>>> longest_string = max(l, key=len)
>>> longest_string
'aaa'
Blog
Python: defaultdict(list) Dictionary of Lists
Most of the time, when you work on large data, you'll need dictionaries in Python. Dictionaries of lists are very useful for storing large data in a very organized way. You can always build them by initiating empty lists inside an empty dictionary, but when you don't know how many of them you'll end up with, or if you just want an easier option, use defaultdict(list). You just need to import it first:
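For instance (the keys and values here are made up for illustration):

```python
from collections import defaultdict

hits = defaultdict(list)  # missing keys start out as empty lists
for gene, sample in [("BRCA1", "s1"), ("TP53", "s1"), ("BRCA1", "s2")]:
    hits[gene].append(sample)  # no need to check whether the key exists

print(dict(hits))  # → {'BRCA1': ['s1', 's2'], 'TP53': ['s1']}
```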
Blog
Python: extend() Append Elements of a List to a List
When you append a list to a list using the append() method, you'll see your list is appended as a single nested element:
>>> l = ["a"]
>>> l2 = ["a", "b"]
>>> l.append(l2)
>>> l
['a', ['a', 'b']]
If you want to append the elements of the list directly without creating nested lists, use the extend() method:
>>> l = ["a"]
>>> l2 = ["a", "b"]
>>> l.extend(l2)
>>> l
['a', 'a', 'b']
Blog
Salmonella Data Preprocessing for PCSF Algorithm
This post describes data preprocessing in Salmonella project for Prize-Collecting Steiner Forest Problem (PCSF) algorithm.
Salmonella data taken from Table S6 in Phosphoproteomic Analysis of Salmonella-Infected Cells Identifies Key Kinase Regulators and SopB-Dependent Host Phosphorylation Events by Rogers, LD et al. has been converted to tab delimited TXT file from its original XLS file for easy reading in Python.
The data should be separated into time-point files (2, 5, 10 and 20 minutes), each of which will contain the corresponding phosphoproteins and their fold changes.
Blog
How to Install openpyxl on Windows
openpyxl is a Python library to read/write Excel 2007 xlsx/xlsm files. To download and install on Windows:
Download it from Python Packages
Then, to install, extract the tarball you downloaded, open up CMD, navigate to the folder you extracted and run the following:
C:\Users\Gungor>cd Downloads\openpyxl-2.1.2.tar\dist\openpyxl-2.1.2\openpyxl-2.1.2
C:\Users\Gungor\Downloads\openpyxl-2.1.2.tar\dist\openpyxl-2.1.2\openpyxl-2.1.2>python setup.py install
It's going to install everything and report any errors. If nothing looks like an error, you're good to go.
Blog
How to Install Numpy Python Package on Windows
Numpy (Numerical Python) is a great Python package that you should definitely make use of if you're doing scientific computing.
Installing it on Windows can be difficult if you don't know how to do it via the command line. There are unofficial Windows binaries for Numpy for 32- and 64-bit Windows that make it super easy to install.
Go to the link below and download the one for your system and Python version:http://www.
Blog
JointSNVMix Installation on Linux Mint 16 Cython, Pysam Included
JointSNVMix is a software package that consists of a number of tools for calling somatic mutations in tumour/normal paired NGS data.
It requires Python (>= 2.7), Cython (>= 0.13) and Pysam (== 0.5.0).
Python should be installed by default on a Linux machine, so I will describe the installation of the others and JointSNVMix.
Note that this guide may become outdated after some time, so please double-check before following it.
Install Cython
Blog
Set Up Google Cloud SDK on Windows using Cygwin
Windows isn't the best environment for software development, I believe, but if you have to use it, there is nice software to make it easier for you. Cygwin here will help us use the Google Cloud tools, but the installation requires certain things you should be aware of beforehand.
You’ll need
Python latest 2.7.x
Google Cloud SDK
Cygwin 32-bit (i.e. setup-x86.exe - note only this one works)
openssh, curl and latest 2.
Blog
Some String Functions in R, String Manipulation in R
I have programmed in Perl, Python, and PHP before, and string manipulation was more direct and easier in them than in R. Still, there are useful functions for string manipulation in R. I'm not an expert in R, but I've been dealing with it for a while and have learned some good functions for this purpose.
Concatenate strings
Concatenation is done with the paste function. It takes the strings to concatenate as arguments separated by commas, as well as the separator character(s).
Blog
First Impressions and Thoughts on Rosalind Project
Although I signed up for Rosalind.info 8 months ago, I didn't really play around with it. But last week, after I learned about it in a BiGCaT science cafe, I was more interested than before and just started solving problems.
Each problem has a description of the context and of the problem itself, along with a sample input and output. Sometimes there are hints about the solution. What I did was write a solution that works for the sample and, hopefully, for the problem.
Tag: blog
Blog
Tags Cloud Sorted by Post Count for Jekyll Blogs without Plugins
Recently, I have been transferring my old posts from a Blogger blog to my new Jekyll blog, since I really like this way of blogging. But some features that I liked in Blogger weren't supported in Jekyll by default. I did some research and found a very nice way of generating a tag cloud for my blog.
Although I build my blog locally and then push to GitHub pages, I still try not to use a custom plugin.
Blog
Here I Am! Welcome!
Hello,
Through this blog, I will be learning (together with any visitors) about Bioinformatics, my special area of interest within biology, which I need to explore further and learn much more about. I just finished my first post, covering definitions of Bioinformatics given by various authorities. Later, I also want to cover the definitions of many principles used in Bioinformatics. Programming languages and statistical methods for Bioinformatics will also be topics of my posts. At the same time, I plan to include news about Bioinformatics and to follow (and share) the latest developments through it.
Tag: cloud
Blog
Tags Cloud Sorted by Post Count for Jekyll Blogs without Plugins
Recently, I have been transferring my old posts from a Blogger blog to my new Jekyll blog, since I really like this way of blogging. But some features that I liked in Blogger weren't supported in Jekyll by default. I did some research and found a very nice way of generating a tag cloud for my blog.
Although I build my blog locally and then push to GitHub pages, I still try not to use a custom plugin.
Tag: jekyll
Blog
Tags Cloud Sorted by Post Count for Jekyll Blogs without Plugins
Recently, I have been transferring my old posts from a Blogger blog to my new Jekyll blog, since I really like this way of blogging. But some features that I liked in Blogger weren't supported in Jekyll by default. I did some research and found a very nice way of generating a tag cloud for my blog.
Although I build my blog locally and then push to GitHub pages, I still try not to use a custom plugin.
Tag: tags
Blog
Tags Cloud Sorted by Post Count for Jekyll Blogs without Plugins
Recently, I have been transferring my old posts from a Blogger blog to my new Jekyll blog, since I really like this way of blogging. But some features that I liked in Blogger weren't supported in Jekyll by default. I did some research and found a very nice way of generating a tag cloud for my blog.
Although I build my blog locally and then push to GitHub pages, I still try not to use a custom plugin.
Tag: tags cloud
Blog
Tags Cloud Sorted by Post Count for Jekyll Blogs without Plugins
Recently, I have been transferring my old posts from a Blogger blog to my new Jekyll blog, since I really like this way of blogging. But some features that I liked in Blogger weren't supported in Jekyll by default. I did some research and found a very nice way of generating a tag cloud for my blog.
Although I build my blog locally and then push to GitHub pages, I still try not to use a custom plugin.
Tag: database diagram
Blog
How to Generate Database EER Diagrams from SQL Scripts using MySQL Workbench
MySQL Workbench makes it really easy to generate EER diagrams from SQL scripts. Follow the steps below to make one yourself.
Download and install MySQL Workbench for your system.
See the simple SQL commands below; later I'll use them to generate a sample diagram.
create table country (
    id integer primary key,
    name CHAR(55));

create table city (
    id integer primary key,
    country_id integer,
    name CHAR(55),
    foreign key (country_id) references country(id));
Open MySQL Workbench and create a new model (File -> New Model).
Tag: eer diagram
Blog
How to Generate Database EER Diagrams from SQL Scripts using MySQL Workbench
MySQL Workbench makes it really easy to generate EER diagrams from SQL scripts. Follow the steps below to make one yourself.
Download and install MySQL Workbench for your system.
See the simple SQL commands below; later I'll use them to generate a sample diagram.
create table country (
    id integer primary key,
    name CHAR(55));

create table city (
    id integer primary key,
    country_id integer,
    name CHAR(55),
    foreign key (country_id) references country(id));
Open MySQL Workbench and create a new model (File -> New Model).
Tag: mysql
Blog
How to Generate Database EER Diagrams from SQL Scripts using MySQL Workbench
MySQL Workbench makes it really easy to generate EER diagrams from SQL scripts. Follow the steps below to make one yourself.
Download and install MySQL Workbench for your system.
See the simple SQL commands below; later I'll use them to generate a sample diagram.
create table country (
    id integer primary key,
    name CHAR(55));

create table city (
    id integer primary key,
    country_id integer,
    name CHAR(55),
    foreign key (country_id) references country(id));
Open MySQL Workbench and create a new model (File -> New Model).
Blog
Get Size of MySQL Databases
Use the query below in the MySQL command prompt to get a table of databases and their sizes in MB.
SELECT table_schema "DB Name",
       Round(Sum(data_length + index_length) / 1024 / 1024, 1) "DB Size in MB"
FROM information_schema.tables
GROUP BY table_schema;
Blog
Configuring Mezzanine for Apache server & mod_wsgi in AWS
Install [Mezzanine]({% post_url 2015-05-01-how-to-install-mezzanine-on-ubuntulinux-mint %}), [Apache server]({% post_url 2015-05-08-getting-started-with-your-aws-instance-and %}) and mod_wsgi:
sudo apt-get install libapache2-mod-wsgi
sudo a2enmod wsgi
Set up a MySQL database for your Mezzanine project
Read [my post on how to set up a MySQL database for a Mezzanine project]({% post_url 2015-05-09-how-to-set-up-a-mysql-database-for-a-mezzanine %})
Collect static files:
python manage.py collectstatic
Configure your Apache server configuration for the project like the following:
WSGIPythonPath /home/ubuntu/www/mezzanine-project
<VirtualHost *:80>
    #ServerName example.com
    ServerAdmin admin@example.com
    DocumentRoot /home/ubuntu/www/mezzanine-project
    WSGIScriptAlias / /home/ubuntu/www/mezzanine-project/wsgi.
Blog
How to Set Up a MySQL Database for a Mezzanine Project
Install MySQL server and python-mysqldb package:
sudo apt-get install mysql-server
sudo apt-get install python-mysqldb
Run MySQL:
mysql -u root -p
Create a database:
mysql> create database mezzanine_project;
Confirm it:
mysql> show databases;
Exit:
mysql> exit
Configure local_settings.py:
cd path/to/your/mezzanine/project
nano local_settings.py
Like the following:
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "mezzanine_project",
        "USER": "root",
        "PASSWORD": "123456",
        "HOST": "",
        "PORT": "",
    }
}
Note: replace the password with your own.
Blog
How to Clear (or Drop) DB Table of A Django App
Let's say you created a Django app, ran python manage.py syncdb and created its table. Every time you make a change to the table, you'll need to drop that table and run python manage.py syncdb again to update it. Here is how you drop a table of a Django app:
$ python manage.py sqlclear app_name | python manage.py dbshell
Drop the tables of an app with migrations (Django >= 1.8):
$ python manage.py migrate appname zero
Recreate all the tables:
Blog
Install Apache2, PHP5, MySQL & phpMyAdmin on Ubuntu 12.04
First, install apache2:
sudo apt-get install apache2
Then, for it to work:
sudo service apache2 restart
For a custom www folder:
sudo cp /etc/apache2/sites-available/default /etc/apache2/sites-available/www
gksudo gedit /etc/apache2/sites-available/www
Change the DocumentRoot and Directory directives to point to the new location, for example /home/user/www/
Save and see (link here clean URLs not working Laravel 4)
Make www the default and disable the old default:
sudo a2dissite default && sudo a2ensite www
sudo service apache2 restart
Create a new file in www
Blog
Retrieving Data with AJAX using jQuery, PHP and MySQL
Last semester, I took a course from the Informatics Institute at METU called "Biological Databases and Data Analysis Tools", where we first learned what a database is and how to query it, along with the technology behind databases. Then we learned about the many biological databases and data analysis tools available, including gene, protein and pathway databases, and tools for creating databases.
As a final project, we were asked to create an online tool that can search a database and get the data and display it on any web browsers.
Tag: mysql workbench
Blog
How to Generate Database EER Diagrams from SQL Scripts using MySQL Workbench
MySQL Workbench makes it really easy to generate EER diagrams from SQL scripts. Follow below steps to make one for yourself.
Download and install MySQL Workbench for your system.
See below simple SQL commands, later I’ll use them to generate a sample diagram.
1create table country ( 2 id integer primary key, 3 name CHAR(55)); 4 5create table city ( 6 id integer primary key, 7 country_id integer, 8 name CHAR(55), 9 foreign key (country_id) references country(id)); Open MySQL Workbench and create a new model (File -> New Model).
Tag: sql
Blog
How to Generate Database EER Diagrams from SQL Scripts using MySQL Workbench
MySQL Workbench makes it really easy to generate EER diagrams from SQL scripts. Follow below steps to make one for yourself.
Download and install MySQL Workbench for your system.
See below simple SQL commands, later I’ll use them to generate a sample diagram.
1create table country ( 2 id integer primary key, 3 name CHAR(55)); 4 5create table city ( 6 id integer primary key, 7 country_id integer, 8 name CHAR(55), 9 foreign key (country_id) references country(id)); Open MySQL Workbench and create a new model (File -> New Model).
Tag: awk
Blog
Replace Entire Column with a Number in Bash
Use the awk one-liner below to replace all values in a column (the 5th column in this example) with a value (1 in this example). Note the single quotes: with double quotes, the shell would expand $5 before awk sees it.
awk '{$5=1} {print}' filename > filename.replaced
Tag: one-liner
Blog
Replace Entire Column with a Number in Bash
Use the awk one-liner below to replace all values in a column (the 5th column in this example) with a value (1 in this example). Note the single quotes: with double quotes, the shell would expand $5 before awk sees it.
awk '{$5=1} {print}' filename > filename.replaced
Tag: replace column
Blog
Replace Entire Column with a Number in Bash
Use the awk one-liner below to replace all values in a column (the 5th column in this example) with a value (1 in this example). Note the single quotes: with double quotes, the shell would expand $5 before awk sees it.
awk '{$5=1} {print}' filename > filename.replaced
Tag: config
Blog
Make a Shortcut for SSH Connections
It can be really annoying to reenter the host name again and again if you are working over ssh and the host name is really long (e.g. mistral.ii.metu.edu.tr). Using this method, you can set a shortcut for the host name (e.g. mistral) and use it whenever you connect.
Open ~/.ssh/config for editing:
subl ~/.ssh/config
Add your host definition as follows:
Host mistral
    HostName mistral.ii.metu.edu.tr
    User gbudak
Tag: connections
Blog
Make a Shortcut for SSH Connections
It can be really annoying to reenter the host name again and again if you are working over ssh and the host name is really long (e.g. mistral.ii.metu.edu.tr). Using this method, you can set a shortcut for the host name (e.g. mistral) and use it whenever you connect.
Open ~/.ssh/config for editing:
subl ~/.ssh/config
Add your host definition as follows:
Host mistral
    HostName mistral.ii.metu.edu.tr
    User gbudak
Tag: make
Blog
Make a Shortcut for SSH Connections
It can be really annoying to reenter the host name again and again if you are working over ssh and the host name is really long (e.g. mistral.ii.metu.edu.tr). Using this method, you can set a shortcut for the host name (e.g. mistral) and use it whenever you connect.
Open ~/.ssh/config for editing:
subl ~/.ssh/config
Add your host definition as follows:
Host mistral
    HostName mistral.ii.metu.edu.tr
    User gbudak
Tag: shortcuts
Blog
Make a Shortcut for SSH Connections
It can be really annoying to reenter the host name again and again if you are working over ssh and the host name is really long (e.g. mistral.ii.metu.edu.tr). Using this method, you can set a shortcut for the host name (e.g. mistral) and use it whenever you connect.
Open ~/.ssh/config for editing:
subl ~/.ssh/config
Add your host definition as follows:
Host mistral
    HostName mistral.ii.metu.edu.tr
    User gbudak
Tag: ssh
Blog
Make a Shortcut for SSH Connections
It can be really annoying to reenter the host name again and again if you are working over ssh and the host name is really long (e.g. mistral.ii.metu.edu.tr). Using this method, you can set a shortcut for the host name (e.g. mistral) and use it whenever you connect.
Open ~/.ssh/config for editing:
subl ~/.ssh/config
Add your host definition as follows:
Host mistral
    HostName mistral.ii.metu.edu.tr
    User gbudak
Blog
Passwordless SSH for Mac/Linux
You don't have to enter the ssh password every time you make a connection. Use the method below to generate a key, copy it to the host you want to connect to, and connect anytime without entering your password.
Generate a key:
ssh-keygen
Copy the key to the remote host:
ssh-copy-id root@linuxconfig.org
Blog
Uploading Files to AWS using SSH/SCP
Here is a small command for uploading files to AWS using SSH's scp (secure copy) command.
scp -i path/to/your/key-pairs/file path/to/file/you/want/to/upload ubuntu@PUBLIC_DNS:path/to/the/destination
Blog
AWS Start an Instance and Connect to it
Go to EC2 management console
Create a new key-pair if necessary and download it
Launch an instance
Add HTTP security group for web applications over HTTP
Get public DNS
Change permissions on key-pair file:
chmod 400 path/to/your/file.pem
Connect:
ssh -i path/to/your/file.pem ubuntu@PUBLIC_DNS
Note: ubuntu is the user for connecting to an Ubuntu 64-bit instance; it's different for other images.
Tag: mac
Blog
Passwordless SSH for Mac/Linux
You don't have to enter the ssh password every time you make a connection. Use the method below to generate a key, copy it to the host you want to connect to, and connect anytime without entering your password.
Generate a key:
ssh-keygen
Copy the key to the remote host:
ssh-copy-id root@linuxconfig.org
Tag: passwordless
Blog
Passwordless SSH for Mac/Linux
You don't have to enter the ssh password every time you make a connection. Use the method below to generate a key, copy it to the host you want to connect to, and connect anytime without entering your password.
Generate a key:
ssh-keygen
Copy the key to the remote host:
ssh-copy-id root@linuxconfig.org
Tag: compute significance
Blog
Computing Significance of Overlap between Two Sets using Hypergeometric Test
There are many cases where we have two sets of things (e.g. under two different conditions) such as transcripts, genes or proteins, and we want to compute the significance of the overlap between them. The hypergeometric test is a very simple and widely used option for such cases.
I'll use the phyper function in R, but you can use the same idea in SciPy (Python).
Let's say you have 200 genes (A);
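The excerpt cuts off before the actual numbers, but the upper-tail probability that phyper (or scipy.stats.hypergeom) computes can be sketched with just Python's standard library; the set sizes below are made up for illustration:

```python
from math import comb

def overlap_pvalue(N, n_a, n_b, k):
    """P(overlap >= k) for sets of sizes n_a and n_b drawn from a
    universe of N items: the hypergeometric upper tail, matching
    R's phyper(k - 1, n_a, N - n_a, n_b, lower.tail = FALSE)."""
    return sum(
        comb(n_a, i) * comb(N - n_a, n_b - i)
        for i in range(k, min(n_a, n_b) + 1)
    ) / comb(N, n_b)

# made-up example: a universe of 20000 genes, sets of 200 and 300,
# and an observed overlap of 20 genes
p = overlap_pvalue(20000, 200, 300, 20)
print(p)  # far below 0.05: the overlap is much larger than expected by chance
```

The expected overlap by chance here is about 200 * 300 / 20000 = 3 genes, so an overlap of 20 is highly significant.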
Tag: gene sets
Blog
Computing Significance of Overlap between Two Sets using Hypergeometric Test
There are many cases where we have two sets of things (e.g. under two different conditions) such as transcripts, genes or proteins, and we want to compute the significance of the overlap between them. The hypergeometric test is a very simple and widely used option for such cases.
I'll use the phyper function in R, but you can use the same idea in SciPy (Python).
Let's say you have 200 genes (A);
Tag: genes
Blog
Computing Significance of Overlap between Two Sets using Hypergeometric Test
There are many cases where we have two sets of things (e.g. under two different conditions) such as transcripts, genes or proteins, and we want to compute the significance of the overlap between them. The hypergeometric test is a very simple and widely used option for such cases.
I'll use the phyper function in R, but you can use the same idea in SciPy (Python).
Let's say you have 200 genes (A);
Blog
An Exon of Length 2 Appeared in Ensembl
I want to share an interesting finding from our research on exon/intron analysis of human evolutionary history.
I had the genes that emerged at each pass point of human history, and I was using the Ensembl API to get the exons and introns of these genes to perform further analyses.
One gene (ENSG00000197568 - HERV-H LTR-associating 3 - HHLA3) held a surprise: one of its transcripts (ENST00000432224) had an exon (ENSE00001707577) of length 2.
Tag: hypergeometric distribution
Blog
Computing Significance of Overlap between Two Sets using Hypergeometric Test
There are many cases where we have two sets of things (e.g. under two different conditions) such as transcripts, genes or proteins, and we want to compute the significance of the overlap between them. The hypergeometric test is a very simple and widely used option for such cases.
I'll use the phyper function in R, but you can use the same idea in SciPy (Python).
Let's say you have 200 genes (A);
Tag: hypergeometric test
Blog
Computing Significance of Overlap between Two Sets using Hypergeometric Test
There are many cases where we have two sets of things (e.g. under two different conditions) such as transcripts, genes or proteins, and we want to compute the significance of the overlap between them. The hypergeometric test is a very simple and widely used option for such cases.
I'll use the phyper function in R, but you can use the same idea in SciPy (Python).
Let's say you have 200 genes (A);
Tag: p-value
Blog
Computing Significance of Overlap between Two Sets using Hypergeometric Test
There are many cases where we have two sets of things (e.g. under two different conditions) such as transcripts, genes or proteins, and we want to compute the significance of the overlap between them. The hypergeometric test is a very simple and widely used option for such cases.
I'll use the phyper function in R, but you can use the same idea in SciPy (Python).
Let's say you have 200 genes (A);
Tag: phyper
Blog
Computing Significance of Overlap between Two Sets using Hypergeometric Test
There are many cases where we have two sets of things (e.g. under two different conditions) such as transcripts, genes or proteins, and we want to compute the significance of the overlap between them. The hypergeometric test is a very simple and widely used option for such cases.
I'll use the phyper function in R, but you can use the same idea in SciPy (Python).
Let's say you have 200 genes (A);
Tag: proteins
Blog
Computing Significance of Overlap between Two Sets using Hypergeometric Test
There are many cases where we have two sets of things (e.g. under two different conditions) such as transcripts, genes or proteins, and we want to compute the significance of the overlap between them. The hypergeometric test is a very simple and widely used option for such cases.
I'll use the phyper function in R, but you can use the same idea in SciPy (Python).
Let's say you have 200 genes (A);
Tag: sets
Blog
Computing Significance of Overlap between Two Sets using Hypergeometric Test
There are many cases where we have two sets of things (e.g. under two different conditions) such as transcripts, genes or proteins, and we want to compute the significance of the overlap between them. The hypergeometric test is a very simple and widely used option for such cases.
I'll use the phyper function in R, but you can use the same idea in SciPy (Python).
Let's say you have 200 genes (A);
Tag: transcripts
Blog
Computing Significance of Overlap between Two Sets using Hypergeometric Test
There are many cases where we have two sets of things (e.g. under two different conditions) such as transcripts, genes or proteins, and we want to compute the significance of the overlap between them. The hypergeometric test is a very simple and widely used option for such cases.
I’ll use the phyper function in R but you can use the same idea in SciPy (Python).
Let’s say you have 200 genes (A);
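The upper-tail overlap p-value described above can be sketched in plain Python. This is a minimal example, not the post's actual code: the universe size of 20,000 genes and the second set of 100 genes are assumed for illustration. The function mirrors what phyper(overlap - 1, m, n, k, lower.tail = FALSE) computes in R.

```python
from math import comb

def overlap_pvalue(overlap, set_a, set_b, universe):
    """P(X >= overlap) when drawing set_b items from a universe that
    contains set_a 'successes' (hypergeometric upper tail)."""
    total = comb(universe, set_b)
    upper = min(set_a, set_b)
    return sum(comb(set_a, k) * comb(universe - set_a, set_b - k)
               for k in range(overlap, upper + 1)) / total

# e.g. 40 shared genes between a set of 200 (A) and a set of 100 (B)
# drawn from an assumed universe of 20,000 genes
p = overlap_pvalue(40, 200, 100, 20000)
```

With an expected overlap of only 200 * 100 / 20000 = 1 gene by chance, observing 40 shared genes yields a vanishingly small p-value.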
Blog
An Exon of Length 2 Appeared in Ensembl
I want to share an interesting finding from our research on the exon/intron analysis of human evolutionary history.
I had the genes that emerged at each point of human evolutionary history, and I was using the Ensembl API to get the exons and introns of these genes to perform further analyses.
There was one gene (ENSG00000197568 - HERV-H LTR-associating 3 - HHLA3) with a surprise: one of its transcripts (ENST00000432224) had an exon (ENSE00001707577) of length 2.
Tag: 20. yıl
Blog
The METU Informatics Institute's 20th Anniversary Event
The METU (ODTÜ) Informatics Institute is celebrating the 20th anniversary of its founding with a science festival. Everyone is invited to the festival, which will take place on May 16, 2016 at the METU Culture and Convention Center!
The festival, where science will be accompanied by art and music, will feature the following keynote speakers:
Prof. Dr. Jennifer Hayes: managing director and co-founder, Microsoft Research New England and Microsoft Research New York; Assoc. Prof. Claudio Ferretti: Computer Science, Systems and Communication, University of Milano-Bicocca; Dr. Christian Borgs: researcher, deputy managing director and co-founder, Microsoft Research New England. Click to go to the event's Facebook page.
Tag: bilim festivali
Blog
The METU Informatics Institute's 20th Anniversary Event
The METU (ODTÜ) Informatics Institute is celebrating the 20th anniversary of its founding with a science festival. Everyone is invited to the festival, which will take place on May 16, 2016 at the METU Culture and Convention Center!
The festival, where science will be accompanied by art and music, will feature the following keynote speakers:
Prof. Dr. Jennifer Hayes: managing director and co-founder, Microsoft Research New England and Microsoft Research New York; Assoc. Prof. Claudio Ferretti: Computer Science, Systems and Communication, University of Milano-Bicocca; Dr. Christian Borgs: researcher, deputy managing director and co-founder, Microsoft Research New England. Click to go to the event's Facebook page.
Tag: enformatik
Blog
The METU Informatics Institute's 20th Anniversary Event
The METU (ODTÜ) Informatics Institute is celebrating the 20th anniversary of its founding with a science festival. Everyone is invited to the festival, which will take place on May 16, 2016 at the METU Culture and Convention Center!
The festival, where science will be accompanied by art and music, will feature the following keynote speakers:
Prof. Dr. Jennifer Hayes: managing director and co-founder, Microsoft Research New England and Microsoft Research New York; Assoc. Prof. Claudio Ferretti: Computer Science, Systems and Communication, University of Milano-Bicocca; Dr. Christian Borgs: researcher, deputy managing director and co-founder, Microsoft Research New England. Click to go to the event's Facebook page.
Blog
"Biyoinformatik" or "Biyoenformatik"?
While looking for topics for my posts, I browse the internet along with books. There are of course plenty of foreign-language resources, and they are sufficient, but when I looked at Turkish resources, the first thing that caught my eye was the different forms of this field's name.
As you know, in English this field is called bioinformatics. That is quite natural, because in English informatics comes from the word information with the -ics suffix, and that word has a Latin origin1. The word came into Turkish from the French informatique as enformatik, and bilişim has also been proposed as a Turkish equivalent2. Of course, this French word shares the same root as its English counterpart.
Tag: enformatik enstitüsü
Blog
The METU Informatics Institute's 20th Anniversary Event
The METU (ODTÜ) Informatics Institute is celebrating the 20th anniversary of its founding with a science festival. Everyone is invited to the festival, which will take place on May 16, 2016 at the METU Culture and Convention Center!
The festival, where science will be accompanied by art and music, will feature the following keynote speakers:
Prof. Dr. Jennifer Hayes: managing director and co-founder, Microsoft Research New England and Microsoft Research New York; Assoc. Prof. Claudio Ferretti: Computer Science, Systems and Communication, University of Milano-Bicocca; Dr. Christian Borgs: researcher, deputy managing director and co-founder, Microsoft Research New England. Click to go to the event's Facebook page.
Blog
7th International Symposium on Health Informatics and Bioinformatics
The 7th International Symposium on Health Informatics and Bioinformatics (HIBIT 2012) was first organized in 2005 by the METU Informatics Institute. It aims to bring together academics and researchers in the fields of Health Informatics, Medical Informatics, Computational Biology and Bioinformatics, to provide a venue for presenting work in these fields, and to allow interactive discussion of that work.
This year, HIBIT 2012 will be held on April 19-22, 2012 at the Perissia Hotel in Ürgüp, Nevşehir, organized in partnership by METU, the METU Informatics Institute, the METU Department of Biological Sciences and the METU Department of Computer Engineering.
Tag: microsoft
Blog
The METU Informatics Institute's 20th Anniversary Event
The METU (ODTÜ) Informatics Institute is celebrating the 20th anniversary of its founding with a science festival. Everyone is invited to the festival, which will take place on May 16, 2016 at the METU Culture and Convention Center!
The festival, where science will be accompanied by art and music, will feature the following keynote speakers:
Prof. Dr. Jennifer Hayes: managing director and co-founder, Microsoft Research New England and Microsoft Research New York; Assoc. Prof. Claudio Ferretti: Computer Science, Systems and Communication, University of Milano-Bicocca; Dr. Christian Borgs: researcher, deputy managing director and co-founder, Microsoft Research New England. Click to go to the event's Facebook page.
Tag: odtü
Blog
The METU Informatics Institute's 20th Anniversary Event
The METU (ODTÜ) Informatics Institute is celebrating the 20th anniversary of its founding with a science festival. Everyone is invited to the festival, which will take place on May 16, 2016 at the METU Culture and Convention Center!
The festival, where science will be accompanied by art and music, will feature the following keynote speakers:
Prof. Dr. Jennifer Hayes: managing director and co-founder, Microsoft Research New England and Microsoft Research New York; Assoc. Prof. Claudio Ferretti: Computer Science, Systems and Communication, University of Milano-Bicocca; Dr. Christian Borgs: researcher, deputy managing director and co-founder, Microsoft Research New England. Click to go to the event's Facebook page.
Blog
7th International Symposium on Health Informatics and Bioinformatics
The 7th International Symposium on Health Informatics and Bioinformatics (HIBIT 2012) was first organized in 2005 by the METU Informatics Institute. It aims to bring together academics and researchers in the fields of Health Informatics, Medical Informatics, Computational Biology and Bioinformatics, to provide a venue for presenting work in these fields, and to allow interactive discussion of that work.
This year, HIBIT 2012 will be held on April 19-22, 2012 at the Perissia Hotel in Ürgüp, Nevşehir, organized in partnership by METU, the METU Informatics Institute, the METU Department of Biological Sciences and the METU Department of Computer Engineering.
Tag: code
Blog
Mann Whitney U Test (Wilcoxon Rank-Sum Test) Javascript Implementation
JavaScript is currently really poor in statistical methods compared to Python (SciPy) and R. There are several efforts to fill this gap, most notably jStat. However, many functions, distributions and tests are still missing from this library. In one of my projects, I had to implement a JavaScript version of the Mann-Whitney U test (also called the Wilcoxon rank-sum test). Here, I'm giving a link to its source code and describing how it works.
Tag: html
Blog
Mann Whitney U Test (Wilcoxon Rank-Sum Test) Javascript Implementation
JavaScript is currently really poor in statistical methods compared to Python (SciPy) and R. There are several efforts to fill this gap, most notably jStat. However, many functions, distributions and tests are still missing from this library. In one of my projects, I had to implement a JavaScript version of the Mann-Whitney U test (also called the Wilcoxon rank-sum test). Here, I'm giving a link to its source code and describing how it works.
Tag: implementation
Blog
Mann Whitney U Test (Wilcoxon Rank-Sum Test) Javascript Implementation
JavaScript is currently really poor in statistical methods compared to Python (SciPy) and R. There are several efforts to fill this gap, most notably jStat. However, many functions, distributions and tests are still missing from this library. In one of my projects, I had to implement a JavaScript version of the Mann-Whitney U test (also called the Wilcoxon rank-sum test). Here, I'm giving a link to its source code and describing how it works.
Tag: javascript
Blog
Mann Whitney U Test (Wilcoxon Rank-Sum Test) Javascript Implementation
JavaScript is currently really poor in statistical methods compared to Python (SciPy) and R. There are several efforts to fill this gap, most notably jStat. However, many functions, distributions and tests are still missing from this library. In one of my projects, I had to implement a JavaScript version of the Mann-Whitney U test (also called the Wilcoxon rank-sum test). Here, I'm giving a link to its source code and describing how it works.
Tag: mann whitney u
Blog
Mann Whitney U Test (Wilcoxon Rank-Sum Test) Javascript Implementation
JavaScript is currently really poor in statistical methods compared to Python (SciPy) and R. There are several efforts to fill this gap, most notably jStat. However, many functions, distributions and tests are still missing from this library. In one of my projects, I had to implement a JavaScript version of the Mann-Whitney U test (also called the Wilcoxon rank-sum test). Here, I'm giving a link to its source code and describing how it works.
Tag: scipy
Blog
Mann Whitney U Test (Wilcoxon Rank-Sum Test) Javascript Implementation
JavaScript is currently really poor in statistical methods compared to Python (SciPy) and R. There are several efforts to fill this gap, most notably jStat. However, many functions, distributions and tests are still missing from this library. In one of my projects, I had to implement a JavaScript version of the Mann-Whitney U test (also called the Wilcoxon rank-sum test). Here, I'm giving a link to its source code and describing how it works.
Tag: statistical tests
Blog
Mann Whitney U Test (Wilcoxon Rank-Sum Test) Javascript Implementation
JavaScript is currently really poor in statistical methods compared to Python (SciPy) and R. There are several efforts to fill this gap, most notably jStat. However, many functions, distributions and tests are still missing from this library. In one of my projects, I had to implement a JavaScript version of the Mann-Whitney U test (also called the Wilcoxon rank-sum test). Here, I'm giving a link to its source code and describing how it works.
Tag: statistics
Blog
Mann Whitney U Test (Wilcoxon Rank-Sum Test) Javascript Implementation
JavaScript is currently really poor in statistical methods compared to Python (SciPy) and R. There are several efforts to fill this gap, most notably jStat. However, many functions, distributions and tests are still missing from this library. In one of my projects, I had to implement a JavaScript version of the Mann-Whitney U test (also called the Wilcoxon rank-sum test). Here, I'm giving a link to its source code and describing how it works.
Blog
First Impressions and Thoughts on Rosalind Project
Actually, I signed up for Rosalind.info 8 months ago, but I didn't really play around with it. Last week, however, after learning more about it at a BiGCaT science cafe, I was more interested than before and just started solving problems.
In each problem, you have a description of the context and of the problem itself. There is also a sample input and output, and sometimes there are hints about the solution. What I did was write a solution that works for the sample and, hopefully, for the problem.
Tag: wilcoxon rank-sum
Blog
Mann Whitney U Test (Wilcoxon Rank-Sum Test) Javascript Implementation
JavaScript is currently really poor in statistical methods compared to Python (SciPy) and R. There are several efforts to fill this gap, most notably jStat. However, many functions, distributions and tests are still missing from this library. In one of my projects, I had to implement a JavaScript version of the Mann-Whitney U test (also called the Wilcoxon rank-sum test). Here, I'm giving a link to its source code and describing how it works.
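As a sketch of what such an implementation involves, here is a minimal Python version of the test. This is not the linked JavaScript source; it uses the standard normal approximation without the tie correction, which is the same recipe a small JavaScript port would typically follow.

```python
from math import sqrt, erf

def rank_data(values):
    """1-based ranks, with tied values receiving the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(xs, ys):
    """Two-sided Mann-Whitney U test, normal approximation, no tie correction."""
    n1, n2 = len(xs), len(ys)
    ranks = rank_data(list(xs) + list(ys))
    r1 = sum(ranks[:n1])
    u1 = r1 - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)          # smaller of U1 and U2
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma               # z <= 0 since u <= mu
    p = 2 * 0.5 * (1 + erf(z / sqrt(2)))  # two-sided p from the normal CDF
    return u, min(p, 1.0)
```

For small samples an exact permutation distribution is preferable; the normal approximation above is only reasonable for larger groups.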
Tag: install r package from source
Blog
MiClip 1.3 Installation
MiClip is a CLIP-seq peak calling algorithm implemented in R. It currently doesn't show up on CRAN, but you can obtain it from the archive and install it from the source tar.gz file.
Download the tar.gz file:
wget https://cran.r-project.org/src/contrib/Archive/MiClip/MiClip_1.3.tar.gz
Start R:
R
Install dependencies:
install.packages("moments")
install.packages("VGAM")
Finally, install MiClip 1.3:
install.packages("MiClip_1.3.tar.gz", repos = NULL, type = "source")
Then you can test it by loading the package and viewing its help file.
Tag: miclip
Blog
MiClip 1.3 Installation
MiClip is a CLIP-seq peak calling algorithm implemented in R. It currently doesn't show up on CRAN, but you can obtain it from the archive and install it from the source tar.gz file.
Download the tar.gz file:
wget https://cran.r-project.org/src/contrib/Archive/MiClip/MiClip_1.3.tar.gz
Start R:
R
Install dependencies:
install.packages("moments")
install.packages("VGAM")
Finally, install MiClip 1.3:
install.packages("MiClip_1.3.tar.gz", repos = NULL, type = "source")
Then you can test it by loading the package and viewing its help file.
Tag: mol to svg
Blog
Generating 2D SVG Images of MOL Files using RDKit Transparent Background
The latest release of RDKit (2015-03) can generate SVG images with a few lines of code, but by default the generated SVG image has a white background. Digging into the sources didn't solve my problem, as I couldn't find any option for setting the background to transparent.
An example of SVG image generation can be found in the RDKit blog post called New Drawing Code.
Input cell [3] shows the SVG image generation; it returns the SVG file content as XML.
Tag: rdkit
Blog
Generating 2D SVG Images of MOL Files using RDKit Transparent Background
The latest release of RDKit (2015-03) can generate SVG images with a few lines of code, but by default the generated SVG image has a white background. Digging into the sources didn't solve my problem, as I couldn't find any option for setting the background to transparent.
An example of SVG image generation can be found in the RDKit blog post called New Drawing Code.
Input cell [3] shows the SVG image generation; it returns the SVG file content as XML.
Blog
Install RDKit 2015-03 Build on Ubuntu 14.04 / Linux Mint 17
RDKit is an open source toolkit for cheminformatics. It has many functionalities for working with chemical files.
Follow the guide below to install the RDKit 2015-03 build on an Ubuntu 14.04 / Linux Mint 17 computer. Since the Ubuntu packages don't have the latest RDKit for trusty, you have to build RDKit from its source.
Install Dependencies
sudo apt-get install flex bison build-essential python-numpy cmake python-dev sqlite3 libsqlite3-dev libboost1.54-all-dev
Download the Build
Tag: svg
Blog
Generating 2D SVG Images of MOL Files using RDKit Transparent Background
The latest release of RDKit (2015-03) can generate SVG images with a few lines of code, but by default the generated SVG image has a white background. Digging into the sources didn't solve my problem, as I couldn't find any option for setting the background to transparent.
An example of SVG image generation can be found in the RDKit blog post called New Drawing Code.
Input cell [3] shows the SVG image generation; it returns the SVG file content as XML.
Tag: transparent background
Blog
Generating 2D SVG Images of MOL Files using RDKit Transparent Background
The latest release of RDKit (2015-03) can generate SVG images with a few lines of code, but by default the generated SVG image has a white background. Digging into the sources didn't solve my problem, as I couldn't find any option for setting the background to transparent.
An example of SVG image generation can be found in the RDKit blog post called New Drawing Code.
Input cell [3] shows the SVG image generation; it returns the SVG file content as XML.
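Since the drawing code offers no transparency option, one workaround is to post-process the SVG text and drop the white background rectangle before saving it. The sketch below assumes the background is a self-closing <rect> filled with #FFFFFF; the sample markup is an illustration, not RDKit's documented output.

```python
import re

def make_svg_transparent(svg_text):
    # Remove self-closing <rect .../> elements whose markup contains a
    # white (#FFFFFF) fill, given either as an attribute or inline style.
    return re.sub(r"<rect[^>]*#[Ff]{6}[^>]*/>", "", svg_text)

# a simplified stand-in for generated SVG content
svg = ('<svg xmlns="http://www.w3.org/2000/svg">'
       '<rect style="opacity:1.0;fill:#FFFFFF" width="300" height="300"/>'
       '<path d="M 10 10 L 20 20"/></svg>')
print(make_svg_transparent(svg))
```

Note that the pattern would also remove any other rect mentioning #FFFFFF; for production use, a real XML parser would be a safer choice than a regular expression.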
Tag: cairo
Blog
Install Cairo Graphics and PyCairo on Ubuntu 14.04 / Linux Mint 17
Cairo is a 2D graphics library written in the C programming language, but if you'd like to use it from Python, you should also install the Python bindings for Cairo.
This guide goes through the installation of the Cairo graphics library version 1.14.2 (the most recent) and the py2cairo Python bindings version 1.10.1 (also the most recent).
Install Cairo
It's very easy with the following repository: just add it, update your packages and install.
Tag: cairo 1.14.2
Blog
Install Cairo Graphics and PyCairo on Ubuntu 14.04 / Linux Mint 17
Cairo is a 2D graphics library written in the C programming language, but if you'd like to use it from Python, you should also install the Python bindings for Cairo.
This guide goes through the installation of the Cairo graphics library version 1.14.2 (the most recent) and the py2cairo Python bindings version 1.10.1 (also the most recent).
Install Cairo
It's very easy with the following repository: just add it, update your packages and install.
Tag: cairo graphics
Blog
Install Cairo Graphics and PyCairo on Ubuntu 14.04 / Linux Mint 17
Cairo is a 2D graphics library written in the C programming language, but if you'd like to use it from Python, you should also install the Python bindings for Cairo.
This guide goes through the installation of the Cairo graphics library version 1.14.2 (the most recent) and the py2cairo Python bindings version 1.10.1 (also the most recent).
Install Cairo
It's very easy with the following repository: just add it, update your packages and install.
Tag: py2cairo
Blog
Install Cairo Graphics and PyCairo on Ubuntu 14.04 / Linux Mint 17
Cairo is a 2D graphics library written in the C programming language, but if you'd like to use it from Python, you should also install the Python bindings for Cairo.
This guide goes through the installation of the Cairo graphics library version 1.14.2 (the most recent) and the py2cairo Python bindings version 1.10.1 (also the most recent).
Install Cairo
It's very easy with the following repository: just add it, update your packages and install.
Tag: py2cairo 1.10.1
Blog
Install Cairo Graphics and PyCairo on Ubuntu 14.04 / Linux Mint 17
Cairo is a 2D graphics library written in the C programming language, but if you'd like to use it from Python, you should also install the Python bindings for Cairo.
This guide goes through the installation of the Cairo graphics library version 1.14.2 (the most recent) and the py2cairo Python bindings version 1.10.1 (also the most recent).
Install Cairo
It's very easy with the following repository: just add it, update your packages and install.
Tag: boost
Blog
Install RDKit 2015-03 Build on Ubuntu 14.04 / Linux Mint 17
RDKit is an open source toolkit for cheminformatics. It has many functionalities for working with chemical files.
Follow the guide below to install the RDKit 2015-03 build on an Ubuntu 14.04 / Linux Mint 17 computer. Since the Ubuntu packages don't have the latest RDKit for trusty, you have to build RDKit from its source.
Install Dependencies
sudo apt-get install flex bison build-essential python-numpy cmake python-dev sqlite3 libsqlite3-dev libboost1.54-all-dev
Download the Build
Tag: linux mint
Blog
Install RDKit 2015-03 Build on Ubuntu 14.04 / Linux Mint 17
RDKit is an open source toolkit for cheminformatics. It has many functionalities for working with chemical files.
Follow the guide below to install the RDKit 2015-03 build on an Ubuntu 14.04 / Linux Mint 17 computer. Since the Ubuntu packages don't have the latest RDKit for trusty, you have to build RDKit from its source.
Install Dependencies
sudo apt-get install flex bison build-essential python-numpy cmake python-dev sqlite3 libsqlite3-dev libboost1.54-all-dev
Download the Build
Blog
ClipCrop Installation on Linux Mint 16 nvm, Node, npm Included
ClipCrop is a tool for detecting structural variations from SAM files, and it's built with Node.js.
ClipCrop uses two pieces of software internally, so they should be installed first.
Install SHRiMP2
SHRiMP is a software package for aligning genomic reads against a target genome.
$ mkdir ~/software
$ cd ~/software
$ wget http://compbio.cs.toronto.edu/shrimp/releases/SHRiMP_2_2_3.lx26.x86_64.tar.gz
$ tar xzvf SHRiMP_2_2_3.lx26.x86_64.tar.gz
$ cd SHRiMP_2_2_3
$ file bin/gmapper
$ export SHRIMP_FOLDER=$PWD
Install BWA
BWA is a software package for mapping low-divergent sequences against a large reference genome.
Tag: ubuntu
Blog
Install RDKit 2015-03 Build on Ubuntu 14.04 / Linux Mint 17
RDKit is an open source toolkit for cheminformatics. It has many functionalities for working with chemical files.
Follow the guide below to install the RDKit 2015-03 build on an Ubuntu 14.04 / Linux Mint 17 computer. Since the Ubuntu packages don't have the latest RDKit for trusty, you have to build RDKit from its source.
Install Dependencies
sudo apt-get install flex bison build-essential python-numpy cmake python-dev sqlite3 libsqlite3-dev libboost1.54-all-dev
Download the Build
Blog
How to Install Mezzanine on Ubuntu/Linux Mint [Complete Guide]
Mezzanine is a CMS application built on the Django web framework. The installation steps are easy, but your environment may not be suitable enough for it to work without problems. So here I'm going to describe a complete installation from scratch in a virtual environment.
First of all, install virtualenv:
$ sudo apt-get install python-virtualenv
Then, create a virtual environment:
$ virtualenv testenv
And activate it:
$ cd testenv
$ source bin/activate
Blog
Geany Color Schemes Ubuntu
There is a collection of color schemes for Geany as well.
Download it on GitHub and follow the instructions.
You’ll need to extract and copy all the files in the colorschemes directory to ~/.config/geany/colorschemes/
Then, restart Geany and go to View -> Editor -> Color Schemes and choose your style.
I’m using Tango.
Source
Blog
Install Geany 1.23 on Ubuntu
Geany is a really nice text editor for Ubuntu. I would recommend it together with the TreeBrowser plugin and some color schemes.
But you'll need the latest version, which is 1.23 for now.
To install this version you need to add a PPA; this will also keep it updated when you update your system.
Execute the following lines one by one:
sudo add-apt-repository ppa:geany-dev/ppa
sudo apt-get update
sudo apt-get install geany
Then, when you start Geany you'll see "This is Geany 1.
Blog
Install Apache2, PHP5, MySQL & phpMyAdmin on Ubuntu 12.04
First, install apache2:
sudo apt-get install apache2
Then, for it to work:
sudo service apache2 restart
For a custom www folder:
sudo cp /etc/apache2/sites-available/default /etc/apache2/sites-available/www
gksudo gedit /etc/apache2/sites-available/www
Change the DocumentRoot and Directory directives to point to the new location, for example /home/user/www/
Save and check the result (see the related post on clean URLs not working in Laravel 4)
Make www the default and disable default:
sudo a2dissite default && sudo a2ensite www
sudo service apache2 restart
Create a new file in www
Blog
Install Perl DBI Module on Ubuntu 12.04
On Terminal, run:
sudo apt-get install libdbi-perl Source
Blog
Start Ubuntu 12.04 Bluetooth Off
On Terminal:
sudo gedit /etc/rc.local
Add the following before the line "exit 0":
rfkill block bluetooth
Save
Source
Blog
Install Steam on Ubuntu 12.04
Download steam_latest.deb at:
http://repo.steampowered.com/steam/archive/precise/steam_latest.deb
Double-click it to open it in Ubuntu Software Center and click Install
It'll start a terminal and ask for your sudo password because some required packages need to be installed; enter your password and continue
Next, it'll update itself
Done
Source
Blog
Enable Hibernation for Lenovo Z500 on Ubuntu 12.04
Using a terminal, add this file:
sudo gedit /etc/polkit-1/localauthority/50-local.d/com.ubuntu.enable-hibernate.pkla
With this content:
[Re-enable hibernate by default]
Identity=unix-user:*
Action=org.freedesktop.upower.hibernate
ResultActive=yes
Save & reboot
Source
Blog
Install Spotify on Ubuntu 12.04
Start Software Sources from Dash Home
Add the following in the Other Sources tab:
deb http://repository.spotify.com stable non-free
Close Software Sources
Add the Spotify repo key on Terminal:
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 94558F59
Install Spotify on Terminal:
sudo apt-get update && sudo apt-get install spotify-client
Find Spotify in Dash Home
Source
Blog
Enable Software Sources in Dash Home Ubuntu 12.04
First copy the software sources desktop file to your local applications folder:
mkdir -p ~/.local/share/applications
cp /usr/share/applications/software-properties-gtk.desktop ~/.local/share/applications/
Edit the file & change the line NoDisplay=true to NoDisplay=false:
gedit ~/.local/share/applications/software-properties-gtk.desktop
Save, log out and log in
Source
Blog
Save Brightness Settings Ubuntu 12.04 LTS
If your laptop starts with minimum or maximum brightness and you want a fixed default value, do the following:
Run a terminal and type the following to get the maximum brightness:
cat /sys/class/backlight/acpi_video0/max_brightness
Now set the brightness as you want and run the following, which gives you the value for the current setting:
cat /sys/class/backlight/acpi_video0/brightness
Edit /etc/rc.local to have that value as the default after each reboot / start:
sudo gedit /etc/rc.local
Add this line before exit 0:
Blog
Hotkeys (special keys) Volume/Brightness Controls Don't Work After Suspend
What seems to solve this problem on Ubuntu 12.04 LTS (Lenovo Z500):
Open this file:
sudo gedit /etc/default/grub
Modify the line like this:
GRUB_CMDLINE_LINUX="noapic"
Close it and run the following:
sudo update-grub
Restart your computer
Source
Blog
How To Make A File or Script Executable in Ubuntu
Start a terminal (CTRL + Alt + T can be used, or just go to Dash Home and type Terminal):
Run the command below:
sudo chmod +x /path/to/your/file
Source
Blog
Suspend Laptop When Lid Closed Ubuntu 12.04 LTS in Lenovo Z500
I guess this is a bug. Although suspend is set in the Power settings, the laptop doesn't suspend when its lid is closed.
To solve it, I found a workaround on the web. Here is how you implement it:
Create the folder if it's not present:
sudo mkdir /etc/acpi/local
Set its permissions:
sudo chmod 755 /etc/acpi/local
Create the script:
sudo gedit /etc/acpi/local/lid.sh.post
Copy-paste the following:
#!/bin/bash
grep -q closed /proc/acpi/button/lid/*/state
if [ $?
Blog
Install Ensembl API and BioPerl 1.2.3 on Your System
I’m going to work on a project that requires lots of queries on Ensembl databases, so I wanted to install the Ensembl API to begin with. Since it's programmed in Perl, I will be using Perl in this project.
There is a nice tutorial on the Ensembl website for the API installation. Here I will describe some of the steps.
1. Download the API and BioPerl
Go to the Ensembl FTP site ftp://ftp.ensembl.org/pub/ and download "ensembl-api.tar.gz", or click here
Tag: 2d image
Blog
Generating 2D Images of Molecules from MOL Files using Open Babel
Open Babel is a tool for working with molecular data in many ways: converting one format to another, analysis, molecular modeling, etc. It also has a method to convert MOL files into SVG or PNG images to represent them as 2D images.
Install Open Babel in Linux as follows, or go to their page for other operating systems:
sudo apt-get install openbabel
Open Babel uses the same command to generate SVG or PNG, and recognizes the file format from the filename given to the output option -O.
Tag: 2d molecule
Blog
Generating 2D Images of Molecules from MOL Files using Open Babel
Open Babel is a tool for working with molecular data in many ways: converting one format to another, analysis, molecular modeling, etc. It also has a method to convert MOL files into SVG or PNG images to represent them as 2D images.
Install Open Babel in Linux as follows, or go to their page for other operating systems:
sudo apt-get install openbabel
Open Babel uses the same command to generate SVG or PNG, and recognizes the file format from the filename given to the output option -O.
Tag: mol file
Blog
Generating 2D Images of Molecules from MOL Files using Open Babel
Open Babel is a tool for working with molecular data in many ways: converting one format to another, analysis, molecular modeling, etc. It also has a method to convert MOL files into SVG or PNG images to represent them as 2D images.
Install Open Babel in Linux as follows, or go to their page for other operating systems:
sudo apt-get install openbabel
Open Babel uses the same command to generate SVG or PNG, and recognizes the file format from the filename given to the output option -O.
Tag: molecule image
Blog
Generating 2D Images of Molecules from MOL Files using Open Babel
Open Babel is a tool for working with molecular data in many ways: converting one format to another, analysis, molecular modeling, etc. It also has a method to convert MOL files into SVG or PNG images to represent them as 2D images.
Install Open Babel in Linux as follows, or go to their page for other operating systems:
sudo apt-get install openbabel
Open Babel uses the same command to generate SVG or PNG, and recognizes the file format from the filename given to the output option -O.
Tag: open babel
Blog
Generating 2D Images of Molecules from MOL Files using Open Babel
Open Babel is a tool for working with molecular data in many ways: converting one format to another, analysis, molecular modeling, etc. It also has a method to convert MOL files into SVG or PNG images to represent them as 2D images.
Install Open Babel in Linux as follows, or go to their page for other operating systems:
sudo apt-get install openbabel
Open Babel uses the same command to generate SVG or PNG, and recognizes the file format from the filename given to the output option -O.
Blog
Obtaining Molecule Description using Open Babel / PyBel
Open Babel is a great tool for analyzing and investigating molecular data (.MOL, .SDF files). Its Python API is particularly nice if you are already familiar with Python. In this post, I'll demonstrate how you can obtain molecule descriptors such as molecular weight, HBA, HBD, logP, formula and the number of chiral centers using PyBel.
Installation
$ sudo apt-get install openbabel python-openbabel
Usage for MW, HBA, HBD, logP
After reading the .MOL file, we need to use the calcdesc method with the descnames argument to get the descriptors.
Tag: pipe
Blog
Simple Way of Python's subprocess.Popen with a Timeout Option
The subprocess module in Python provides a variety of methods to start a process from a Python script. We may use these methods to run external commands / programs, collect their output and manage them. An example use might be the following:
from subprocess import Popen, PIPE

p = Popen(['ls', '-l'], stdout=PIPE, stderr=PIPE)
stdout, stderr = p.communicate()
print stdout, stderr
These lines can be used to run the ls -l command in a terminal and collect the output (standard output and standard error) in the stdout and stderr variables using the communicate method defined on the process.
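On Python 3.3 and later, communicate itself accepts a timeout argument, so the timeout option can be sketched without extra threads. The run_with_timeout helper name below is hypothetical, and a child Python interpreter stands in for ls -l so the example is portable.

```python
import subprocess
import sys

def run_with_timeout(cmd, timeout):
    """Run cmd with a timeout; on expiry, kill the process and
    collect whatever output it produced before dying."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE, text=True)
    try:
        out, err = p.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        p.kill()                  # terminate the runaway child
        out, err = p.communicate()  # reap it and drain the pipes
    return p.returncode, out, err

# portable stand-in for running an external command
code, out, err = run_with_timeout(
    [sys.executable, '-c', 'print("hello")'], timeout=5)
```

Calling p.communicate() again after p.kill() is important: it reaps the child and avoids a zombie process.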
Blog
Performing Multiple Searches in SRS
The latest version of the analysis script examines more reads than the previous ones, so searching for a name on SRS for every single read was quite a time-consuming operation. In fact, the last analysis took 4 days.
To reduce this, I completely rewrote the analysis script. As always, it first takes the reads that pass the threshold, but now I list their ID numbers directly in an array. Then I turn this list into a single string by separating each element with the pipe character.
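The joining step described above boils down to filtering by a threshold and joining the surviving IDs with a pipe. The original script's language is not shown, so this Python sketch uses made-up sample data:

```python
# Hypothetical read scores; in the real script these come from the analysis.
scores = {"read1": 0.91, "read2": 0.42, "read3": 0.88}
threshold = 0.8

# Keep the IDs that pass the threshold, then build one pipe-separated
# query string so SRS is searched once instead of once per read.
ids = [name for name, score in scores.items() if score >= threshold]
query = "|".join(ids)
print(query)  # read1|read3
```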
Tag: popen
Blog
Simple Way of Python's subprocess.Popen with a Timeout Option
The subprocess module in Python provides a variety of methods to start a process from a Python script. We can use these methods to run external commands/programs, collect their output, and manage them. An example use might look like the following:
from subprocess import Popen, PIPE p = Popen(['ls', '-l'], stdout=PIPE, stderr=PIPE) stdout, stderr = p.communicate() print stdout, stderr These lines run the ls -l command in a terminal and collect the output (standard output and standard error) in the stdout and stderr variables using the communicate method defined on the process.
Tag: popen with timeout
Blog
Simple Way of Python's subprocess.Popen with a Timeout Option
The subprocess module in Python provides a variety of methods to start a process from a Python script. We can use these methods to run external commands/programs, collect their output, and manage them. An example use might look like the following:
from subprocess import Popen, PIPE p = Popen(['ls', '-l'], stdout=PIPE, stderr=PIPE) stdout, stderr = p.communicate() print stdout, stderr These lines run the ls -l command in a terminal and collect the output (standard output and standard error) in the stdout and stderr variables using the communicate method defined on the process.
Tag: subprocess
Blog
Simple Way of Python's subprocess.Popen with a Timeout Option
The subprocess module in Python provides a variety of methods to start a process from a Python script. We can use these methods to run external commands/programs, collect their output, and manage them. An example use might look like the following:
from subprocess import Popen, PIPE p = Popen(['ls', '-l'], stdout=PIPE, stderr=PIPE) stdout, stderr = p.communicate() print stdout, stderr These lines run the ls -l command in a terminal and collect the output (standard output and standard error) in the stdout and stderr variables using the communicate method defined on the process.
Tag: loadbalancer
Blog
Running StarCluster Load Balancer in Background in Linux
StarCluster's loadbalance command regularly monitors the jobs in the queue and adds or removes nodes on the previously created cluster to complete the queue efficiently.
To run it in the background without it being killed when the terminal is closed:
nohup starcluster loadbalance cluster_name >loadbalance.log 2>&1 & or to keep standard output and standard error logs separate:
nohup starcluster loadbalance cluster_name > loadbalance.access.log 2> loadbalance.error.log & This will start the process and output the process ID (PID) which can be used to check or kill it.
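The same detach-and-redirect pattern can be reproduced from Python if you prefer scripting it; a sketch with a placeholder command standing in for starcluster loadbalance:

```python
import os
import subprocess
import sys
import tempfile

# Append the child's combined output to a log file, as the shell
# redirections above do.
log_path = os.path.join(tempfile.gettempdir(), "loadbalance.log")
with open(log_path, "ab") as log:
    p = subprocess.Popen(
        [sys.executable, "-c", "print('balancing...')"],  # stand-in command
        stdout=log,
        stderr=subprocess.STDOUT,
        start_new_session=True,  # detach from the terminal session, like nohup
    )

print(p.pid)  # the PID, which can be used to check on or kill the process
p.wait()
```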
Tag: nohup
Blog
Running StarCluster Load Balancer in Background in Linux
StarCluster's loadbalance command regularly monitors the jobs in the queue and adds or removes nodes on the previously created cluster to complete the queue efficiently.
To run it in the background without it being killed when the terminal is closed:
nohup starcluster loadbalance cluster_name >loadbalance.log 2>&1 & or to keep standard output and standard error logs separate:
nohup starcluster loadbalance cluster_name > loadbalance.access.log 2> loadbalance.error.log & This will start the process and output the process ID (PID) which can be used to check or kill it.
Tag: run in background
Blog
Running StarCluster Load Balancer in Background in Linux
StarCluster's loadbalance command regularly monitors the jobs in the queue and adds or removes nodes on the previously created cluster to complete the queue efficiently.
To run it in the background without it being killed when the terminal is closed:
nohup starcluster loadbalance cluster_name >loadbalance.log 2>&1 & or to keep standard output and standard error logs separate:
nohup starcluster loadbalance cluster_name > loadbalance.access.log 2> loadbalance.error.log & This will start the process and output the process ID (PID) which can be used to check or kill it.
Tag: starcluster
Blog
Running StarCluster Load Balancer in Background in Linux
StarCluster's loadbalance command regularly monitors the jobs in the queue and adds or removes nodes on the previously created cluster to complete the queue efficiently.
To run it in the background without it being killed when the terminal is closed:
nohup starcluster loadbalance cluster_name >loadbalance.log 2>&1 & or to keep standard output and standard error logs separate:
nohup starcluster loadbalance cluster_name > loadbalance.access.log 2> loadbalance.error.log & This will start the process and output the process ID (PID) which can be used to check or kill it.
Blog
Running Script on Cluster (StarCluster)
Start a new cluster with the configuration file you modified:
starcluster start cluster_name Send the script to the running cluster:
starcluster put cluster_name myscr.csh /home/myscr.csh Run it using source:
starcluster sshmaster cluster_name "source /home/myscr.csh >& /home/myscr.log"
Tag: apache2
Blog
Change Apache’s Default User www-data or Home Directory /var/www/
I was getting errors from a StarCluster run because it could not find the .starcluster directory in /var/www/.
This directory holds StarCluster's config file and log directories, so without it, StarCluster can't run.
To solve the issue, I set my own user in Apache's envvars instead of www-data, which also changes the default home directory to mine.
Edit the following file with superuser permissions:
sudo nano /etc/apache2/envvars Enter your username in the following lines and save:
Blog
Getting Started with Your AWS Instance and Installing and Setting Up an Apache Server
Update and upgrade packages:
sudo apt-get update sudo apt-get upgrade Install Apache server:
sudo apt-get install apache2 Set up a root folder in your home folder and create an index file for testing:
mkdir ~/www echo 'Hello, World!' > ~/www/index.html Set up your virtual host:
sudo cp /etc/apache2/sites-available/000-default.conf /etc/apache2/sites-available/000-www.conf sudo nano /etc/apache2/sites-available/000-www.conf Modify DocumentRoot to point to your "www" folder in your home folder (e.g. /home/ubuntu/www)
Then add the following lines after the DocumentRoot line:
Blog
Install Apache2, PHP5, MySQL & phpMyAdmin on Ubuntu 12.04
First, install apache2:
sudo apt-get install apache2 Then, for it to work: sudo service apache2 restart
For custom www folder:
sudo cp /etc/apache2/sites-available/default /etc/apache2/sites-available/www gksudo gedit /etc/apache2/sites-available/www Change the DocumentRoot and Directory directives to point to the new location. For example, /home/user/www/
Save and see (link here clean URLs not working Laravel 4)
Make www default and disable default:
sudo a2dissite default && sudo a2ensite www sudo service apache2 restart Create new file in www
Tag: default home directory
Blog
Change Apache’s Default User www-data or Home Directory /var/www/
I was getting errors from a StarCluster run because it could not find the .starcluster directory in /var/www/.
This directory holds StarCluster's config file and log directories, so without it, StarCluster can't run.
To solve the issue, I set my own user in Apache's envvars instead of www-data, which also changes the default home directory to mine.
Edit the following file with superuser permissions:
sudo nano /etc/apache2/envvars Enter your username in the following lines and save:
Tag: default user
Blog
Change Apache’s Default User www-data or Home Directory /var/www/
I was getting errors from a StarCluster run because it could not find the .starcluster directory in /var/www/.
This directory holds StarCluster's config file and log directories, so without it, StarCluster can't run.
To solve the issue, I set my own user in Apache's envvars instead of www-data, which also changes the default home directory to mine.
Edit the following file with superuser permissions:
sudo nano /etc/apache2/envvars Enter your username in the following lines and save:
Tag: amazon
Blog
Transfer Files to Your AWS S3 Storage in Linux
Uploading files to AWS S3 storage through the GUI can be difficult when many files are involved, or when your files are on a server where you don't have a GUI option. Use the following tool to transfer files to an S3 bucket.
Download and install the following tool:
cd ~/Downloads git clone https://github.com/s3tools/s3cmd.git cd s3cmd/ sudo python setup.py install Next, execute the following to create a configuration file to connect to your AWS S3 account:
Blog
Running Script on Cluster (StarCluster)
Start a new cluster with the configuration file you modified:
starcluster start cluster_name Send the script to the running cluster:
starcluster put cluster_name myscr.csh /home/myscr.csh Run it using source:
starcluster sshmaster cluster_name "source /home/myscr.csh >& /home/myscr.log"
Blog
Uploading Files to AWS using SSH/SCP
Here is a small command for uploading files to AWS using SSH's scp (secure copy) command.
scp -i path/to/your/key-pairs/file path/to/file/you/want/to/upload ubuntu@PUBLIC_DNS:path/to/the/destination
Blog
Configuring Mezzanine for Apache server & mod_wsgi in AWS
Install [Mezzanine]({% post_url 2015-05-01-how-to-install-mezzanine-on-ubuntulinux-mint %}), [Apache server]({% post_url 2015-05-08-getting-started-with-your-aws-instance-and %}) and mod_wsgi:
sudo apt-get install libapache2-mod-wsgi sudo a2enmod wsgi Set up a MySQL database for your Mezzanine project
Read [my post on how to set up a MySQL database for a Mezzanine project]({% post_url 2015-05-09-how-to-set-up-a-mysql-database-for-a-mezzanine %})
Collect static files:
python manage.py collectstatic Configure your Apache configuration for the project like the following:
WSGIPythonPath /home/ubuntu/www/mezzanine-project <VirtualHost *:80> #ServerName example.com ServerAdmin admin@example.com DocumentRoot /home/ubuntu/www/mezzanine-project WSGIScriptAlias / /home/ubuntu/www/mezzanine-project/wsgi.
Blog
Setting Up Mezzanine Projects in AWS
Go to EC2 management console, Security groups and add a Custom TCP inbound rule with port 8000. Select “Anywhere” from the list.
Then follow [this to install Mezzanine]({% post_url 2015-05-01-how-to-install-mezzanine-on-ubuntulinux-mint %})
The above tutorial also explains setting up a site record. Mezzanine's default site record is 127.0.0.1:8000, which should be 0.0.0.0:8000 in our case. So, enter 0.0.0.0:8000 when you're asked for a site record when you run
python manage.py createdb Also, you might still need to provide this site record while running the development server:
Blog
Getting Started with Your AWS Instance and Installing and Setting Up an Apache Server
Update and upgrade packages:
sudo apt-get update sudo apt-get upgrade Install Apache server:
sudo apt-get install apache2 Set up a root folder in your home folder and create an index file for testing:
mkdir ~/www echo 'Hello, World!' > ~/www/index.html Set up your virtual host:
sudo cp /etc/apache2/sites-available/000-default.conf /etc/apache2/sites-available/000-www.conf sudo nano /etc/apache2/sites-available/000-www.conf Modify DocumentRoot to point to your "www" folder in your home folder (e.g. /home/ubuntu/www)
Then add the following lines after the DocumentRoot line:
Blog
AWS Start an Instance and Connect to it
Go to EC2 management console
Create a new key-pair if necessary and download it
Launch an instance
Add HTTP security group for web applications over HTTP
Get public DNS
Change permissions on key-pair file:
chmod 400 path/to/your/file.pem Connect:
ssh -i path/to/your/file.pem ubuntu@PUBLIC_DNS Note: ubuntu is the username for connecting to a 64-bit Ubuntu instance. It's different for other images.
Tag: aws
Blog
Transfer Files to Your AWS S3 Storage in Linux
Uploading files to AWS S3 storage through the GUI can be difficult when many files are involved, or when your files are on a server where you don't have a GUI option. Use the following tool to transfer files to an S3 bucket.
Download and install the following tool:
cd ~/Downloads git clone https://github.com/s3tools/s3cmd.git cd s3cmd/ sudo python setup.py install Next, execute the following to create a configuration file to connect to your AWS S3 account:
Blog
Running Script on Cluster (StarCluster)
Start a new cluster with the configuration file you modified:
starcluster start cluster_name Send the script to the running cluster:
starcluster put cluster_name myscr.csh /home/myscr.csh Run it using source:
starcluster sshmaster cluster_name "source /home/myscr.csh >& /home/myscr.log"
Blog
Uploading Files to AWS using SSH/SCP
Here is a small command for uploading files to AWS using SSH's scp (secure copy) command.
scp -i path/to/your/key-pairs/file path/to/file/you/want/to/upload ubuntu@PUBLIC_DNS:path/to/the/destination
Blog
Configuring Mezzanine for Apache server & mod_wsgi in AWS
Install [Mezzanine]({% post_url 2015-05-01-how-to-install-mezzanine-on-ubuntulinux-mint %}), [Apache server]({% post_url 2015-05-08-getting-started-with-your-aws-instance-and %}) and mod_wsgi:
sudo apt-get install libapache2-mod-wsgi sudo a2enmod wsgi Set up a MySQL database for your Mezzanine project
Read [my post on how to set up a MySQL database for a Mezzanine project]({% post_url 2015-05-09-how-to-set-up-a-mysql-database-for-a-mezzanine %})
Collect static files:
python manage.py collectstatic Configure your Apache configuration for the project like the following:
WSGIPythonPath /home/ubuntu/www/mezzanine-project <VirtualHost *:80> #ServerName example.com ServerAdmin admin@example.com DocumentRoot /home/ubuntu/www/mezzanine-project WSGIScriptAlias / /home/ubuntu/www/mezzanine-project/wsgi.
Blog
Setting Up Mezzanine Projects in AWS
Go to EC2 management console, Security groups and add a Custom TCP inbound rule with port 8000. Select “Anywhere” from the list.
Then follow [this to install Mezzanine]({% post_url 2015-05-01-how-to-install-mezzanine-on-ubuntulinux-mint %})
The above tutorial also explains setting up a site record. Mezzanine's default site record is 127.0.0.1:8000, which should be 0.0.0.0:8000 in our case. So, enter 0.0.0.0:8000 when you're asked for a site record when you run
python manage.py createdb Also, you might still need to provide this site record while running the development server:
Blog
Getting Started with Your AWS Instance and Installing and Setting Up an Apache Server
Update and upgrade packages:
sudo apt-get update sudo apt-get upgrade Install Apache server:
sudo apt-get install apache2 Set up a root folder in your home folder and create an index file for testing:
mkdir ~/www echo 'Hello, World!' > ~/www/index.html Set up your virtual host:
sudo cp /etc/apache2/sites-available/000-default.conf /etc/apache2/sites-available/000-www.conf sudo nano /etc/apache2/sites-available/000-www.conf Modify DocumentRoot to point to your "www" folder in your home folder (e.g. /home/ubuntu/www)
Then add the following lines after the DocumentRoot line:
Blog
AWS Start an Instance and Connect to it
Go to EC2 management console
Create a new key-pair if necessary and download it
Launch an instance
Add HTTP security group for web applications over HTTP
Get public DNS
Change permissions on key-pair file:
chmod 400 path/to/your/file.pem Connect:
ssh -i path/to/your/file.pem ubuntu@PUBLIC_DNS Note: ubuntu is the username for connecting to a 64-bit Ubuntu instance. It's different for other images.
Tag: aws s3
Blog
Transfer Files to Your AWS S3 Storage in Linux
Uploading files to AWS S3 storage through the GUI can be difficult when many files are involved, or when your files are on a server where you don't have a GUI option. Use the following tool to transfer files to an S3 bucket.
Download and install the following tool:
cd ~/Downloads git clone https://github.com/s3tools/s3cmd.git cd s3cmd/ sudo python setup.py install Next, execute the following to create a configuration file to connect to your AWS S3 account:
Tag: s3cmd
Blog
Transfer Files to Your AWS S3 Storage in Linux
Uploading files to AWS S3 storage through the GUI can be difficult when many files are involved, or when your files are on a server where you don't have a GUI option. Use the following tool to transfer files to an S3 bucket.
Download and install the following tool:
cd ~/Downloads git clone https://github.com/s3tools/s3cmd.git cd s3cmd/ sudo python setup.py install Next, execute the following to create a configuration file to connect to your AWS S3 account:
Tag: django
Blog
ImportError: Reportlab Version 2.1+ is needed
A little bug in xhtml2pdf version 0.0.5. To fix it:
$ sudo nano /usr/local/lib/python2.7/dist-packages/xhtml2pdf/util.py Change the following lines:
if not (reportlab.Version[0] == "2" and reportlab.Version[2] >= "1"): raise ImportError("Reportlab Version 2.1+ is needed!") REPORTLAB22 = (reportlab.Version[0] == "2" and reportlab.Version[2] >= "2") With these lines:
if not (reportlab.Version[:3] >= "2.1"): raise ImportError("Reportlab Version 2.1+ is needed!") REPORTLAB22 = (reportlab.Version[:3] >= "2.1")
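Note that comparing version strings lexicographically, as the patched lines still do, only works while every component is a single digit ("2.10" sorts before "2.2"). A more robust sketch parses the version into an integer tuple (reportlab_version is a stand-in value here):

```python
def version_tuple(version):
    """Parse a dotted version string like '2.7' into a tuple of ints,
    so comparisons are numeric instead of lexicographic."""
    return tuple(int(part) for part in version.split("."))

reportlab_version = "2.7"  # stand-in for reportlab.Version

# Lexicographic comparison misorders two-digit components...
assert "2.10" < "2.2"
# ...while numeric tuples compare correctly.
assert version_tuple("2.10") > version_tuple("2.2")

if version_tuple(reportlab_version) < (2, 1):
    raise ImportError("Reportlab Version 2.1+ is needed!")
```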
Blog
Django Migrations Table Already Exists Fix
Fix this issue by faking the migrations:
python manage.py migrate --fake <appname> Taken from this SO answer
Blog
Mezzanine BS Banners Translation with django-modeltranslation
Mezzanine BS Banners is a nice app for adding Bootstrap 3 banners/sliders to your Mezzanine projects. The Banners model in the BS Banners app has a title, and its stacked inline Slides model has a title and content for translation.
After [installing and setting up Django/Mezzanine translations]({% post_url 2015-07-01-djangomezzanine-content-translation-for-mezzanine %}):
Create a translation.py inside your Mezzanine project or your custom theme/skin application and copy/paste the following lines:
from modeltranslation.translator import translator from mezzanine.core.translation import TranslatedSlugged, TranslatedRichText from mezzanine_bsbanners.
Blog
Django/Mezzanine Content Translation for Mezzanine Built-in Applications
Mezzanine comes with additional Django applications such as pages and galleries, and to translate their content, Mezzanine supports django-modeltranslation integration.
Install django-modeltranslation:
pip install django-modeltranslation Add the following to INSTALLED_APPS in settings.py:
"modeltranslation", And the following in settings.py:
USE_MODELTRANSLATION = True Also, move mezzanine.pages above the other Mezzanine apps in INSTALLED_APPS in settings.py, like so:
"mezzanine.pages", "mezzanine.boot", "mezzanine.conf", "mezzanine.core", "mezzanine.generic", "mezzanine.blog", "mezzanine.forms", "mezzanine.galleries", "mezzanine.twitter", "mezzanine.accounts", "mezzanine.mobile", Run the following to create fields in the database tables for translations:
Blog
Setting Up Templates and Python Scripts for Translation
Templates need the following template tag:
{% raw %}{% load i18n %}{% endraw %} Then, wrapping any text with
{% raw %}{% trans "TEXT" %}{% endraw %} will make it translatable via the Rosetta Django application.
In Python scripts, you need to import the following:
from django.utils.translation import ugettext_lazy as _ Then wrapping any text with
_('TEXT') will make it translatable.
Blog
Django Rosetta Translations for Django Applications
Make a directory called locale/ under the application directory:
cd app_name mkdir locale Add the folder to the LOCALE_PATHS tuple in settings.py:
LOCALE_PATHS = ( os.path.join(PROJECT_ROOT, 'app_name', 'locale/'), ) Run the following command to create the PO translation file for the application:
python ../manage.py makemessages -l tr -e html,py,txt python ../manage.py compilemessages The -l option is for the language; it should match your definition in settings.py:
LANGUAGES = ( ('en', _('English')), ('tr', _('Turkish')), ('it', _('Italian')), ) Repeat the last step for all languages and then go to the Rosetta URL to translate.
Blog
Django Rosetta Installation
Install SciPy:
sudo apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose Install pymongo and nltk:
sudo pip install pymongo sudo pip install nltk Install Python MySQLdb:
sudo apt-get install python-mysqldb Install Rosetta:
sudo pip install django-rosetta Add the following into INSTALLED_APPS in settings.py:
"rosetta", Add the following into urls.py:
url(r'^translations/', include('rosetta.urls')), To also allow language prefixes, change patterns to i18n_patterns in urls.py:
urlpatterns += i18n_patterns( ... )
Blog
Errno 13 Permission denied Django File Uploads
Run the following command to give www-data permissions on the static folder and all its content:
cd path/to/your/django/project sudo chown -R www-data:www-data static/ Do this on your production server.
Blog
Configuring Mezzanine for Apache server & mod_wsgi in AWS
Install [Mezzanine]({% post_url 2015-05-01-how-to-install-mezzanine-on-ubuntulinux-mint %}), [Apache server]({% post_url 2015-05-08-getting-started-with-your-aws-instance-and %}) and mod_wsgi:
sudo apt-get install libapache2-mod-wsgi sudo a2enmod wsgi Set up a MySQL database for your Mezzanine project
Read [my post on how to set up a MySQL database for a Mezzanine project]({% post_url 2015-05-09-how-to-set-up-a-mysql-database-for-a-mezzanine %})
Collect static files:
python manage.py collectstatic Configure your Apache configuration for the project like the following:
WSGIPythonPath /home/ubuntu/www/mezzanine-project <VirtualHost *:80> #ServerName example.com ServerAdmin admin@example.com DocumentRoot /home/ubuntu/www/mezzanine-project WSGIScriptAlias / /home/ubuntu/www/mezzanine-project/wsgi.
Blog
How to Set Up a MySQL Database for a Mezzanine Project
Install MySQL server and python-mysqldb package:
sudo apt-get install mysql-server sudo apt-get install python-mysqldb Run MySQL:
mysql -u root -p Create a database:
mysql> create database mezzanine_project; Confirm it:
mysql> show databases; Exit:
mysql> exit Configure local_settings.py:
cd path/to/your/mezzanine/project nano local_settings.py Like the following:
DATABASES = { "default": { "ENGINE": "django.db.backends.mysql", "NAME": "mezzanine_project", "USER": "root", "PASSWORD": "123456", "HOST": "", "PORT": "", } } Note: replace with your own password
Blog
How to Install Mezzanine on Ubuntu/Linux Mint [Complete Guide]
Mezzanine is a CMS application built on the Django web framework. The installation steps are easy, but your environment may not be suitable enough for it to work without a problem. So, here I'm going to describe a complete installation from scratch in a virtual environment.
First of all, install virtualenv:
$ sudo apt-get install python-virtualenv Then, create a virtual environment:
$ virtualenv testenv And, activate it: $ cd testenv $ source bin/activate
Blog
How to Clear (or Drop) DB Table of A Django App
Let's say you created a Django app, ran python manage.py syncdb, and created its table. Every time you make a change to the table, you'll need to drop that table and run python manage.py syncdb again to update it. Here is how you drop a table of a Django app:
$ python manage.py sqlclear app_name | python manage.py dbshell Drop tables of an app with migrations (Django >= 1.8):
$ python manage.py migrate appname zero Recreate all the tables:
Tag: django easy pdf
Blog
ImportError: Reportlab Version 2.1+ is needed
A little bug in xhtml2pdf version 0.0.5. To fix it:
$ sudo nano /usr/local/lib/python2.7/dist-packages/xhtml2pdf/util.py Change the following lines:
if not (reportlab.Version[0] == "2" and reportlab.Version[2] >= "1"): raise ImportError("Reportlab Version 2.1+ is needed!") REPORTLAB22 = (reportlab.Version[0] == "2" and reportlab.Version[2] >= "2") With these lines:
if not (reportlab.Version[:3] >= "2.1"): raise ImportError("Reportlab Version 2.1+ is needed!") REPORTLAB22 = (reportlab.Version[:3] >= "2.1")
Tag: mezzanine
Blog
ImportError: Reportlab Version 2.1+ is needed
A little bug in xhtml2pdf version 0.0.5. To fix it:
$ sudo nano /usr/local/lib/python2.7/dist-packages/xhtml2pdf/util.py Change the following lines:
if not (reportlab.Version[0] == "2" and reportlab.Version[2] >= "1"): raise ImportError("Reportlab Version 2.1+ is needed!") REPORTLAB22 = (reportlab.Version[0] == "2" and reportlab.Version[2] >= "2") With these lines:
if not (reportlab.Version[:3] >= "2.1"): raise ImportError("Reportlab Version 2.1+ is needed!") REPORTLAB22 = (reportlab.Version[:3] >= "2.1")
Blog
Mezzanine BS Banners Translation with django-modeltranslation
Mezzanine BS Banners is a nice app for adding Bootstrap 3 banners/sliders to your Mezzanine projects. The Banners model in the BS Banners app has a title, and its stacked inline Slides model has a title and content for translation.
After [installing and setting up Django/Mezzanine translations]({% post_url 2015-07-01-djangomezzanine-content-translation-for-mezzanine %}):
Create a translation.py inside your Mezzanine project or your custom theme/skin application and copy/paste the following lines:
from modeltranslation.translator import translator from mezzanine.core.translation import TranslatedSlugged, TranslatedRichText from mezzanine_bsbanners.
Blog
Django/Mezzanine Content Translation for Mezzanine Built-in Applications
Mezzanine comes with additional Django applications such as pages and galleries, and to translate their content, Mezzanine supports django-modeltranslation integration.
Install django-modeltranslation:
pip install django-modeltranslation Add the following to INSTALLED_APPS in settings.py:
"modeltranslation", And the following in settings.py:
USE_MODELTRANSLATION = True Also, move mezzanine.pages above the other Mezzanine apps in INSTALLED_APPS in settings.py, like so:
"mezzanine.pages", "mezzanine.boot", "mezzanine.conf", "mezzanine.core", "mezzanine.generic", "mezzanine.blog", "mezzanine.forms", "mezzanine.galleries", "mezzanine.twitter", "mezzanine.accounts", "mezzanine.mobile", Run the following to create fields in the database tables for translations:
Blog
Configuring Mezzanine for Apache server & mod_wsgi in AWS
Install [Mezzanine]({% post_url 2015-05-01-how-to-install-mezzanine-on-ubuntulinux-mint %}), [Apache server]({% post_url 2015-05-08-getting-started-with-your-aws-instance-and %}) and mod_wsgi:
sudo apt-get install libapache2-mod-wsgi sudo a2enmod wsgi Set up a MySQL database for your Mezzanine project
Read [my post on how to set up a MySQL database for a Mezzanine project]({% post_url 2015-05-09-how-to-set-up-a-mysql-database-for-a-mezzanine %})
Collect static files:
python manage.py collectstatic Configure your Apache configuration for the project like the following:
WSGIPythonPath /home/ubuntu/www/mezzanine-project <VirtualHost *:80> #ServerName example.com ServerAdmin admin@example.com DocumentRoot /home/ubuntu/www/mezzanine-project WSGIScriptAlias / /home/ubuntu/www/mezzanine-project/wsgi.
Blog
How to Set Up a MySQL Database for a Mezzanine Project
Install MySQL server and python-mysqldb package:
sudo apt-get install mysql-server sudo apt-get install python-mysqldb Run MySQL:
mysql -u root -p Create a database:
mysql> create database mezzanine_project; Confirm it:
mysql> show databases; Exit:
mysql> exit Configure local_settings.py:
cd path/to/your/mezzanine/project nano local_settings.py Like the following:
DATABASES = { "default": { "ENGINE": "django.db.backends.mysql", "NAME": "mezzanine_project", "USER": "root", "PASSWORD": "123456", "HOST": "", "PORT": "", } } Note: replace with your own password
Blog
Setting Up Mezzanine Projects in AWS
Go to EC2 management console, Security groups and add a Custom TCP inbound rule with port 8000. Select “Anywhere” from the list.
Then follow [this to install Mezzanine]({% post_url 2015-05-01-how-to-install-mezzanine-on-ubuntulinux-mint %})
The above tutorial also explains setting up a site record. Mezzanine's default site record is 127.0.0.1:8000, which should be 0.0.0.0:8000 in our case. So, enter 0.0.0.0:8000 when you're asked for a site record when you run
python manage.py createdb Also, you might still need to provide this site record while running the development server:
Blog
How to Install Mezzanine on Ubuntu/Linux Mint [Complete Guide]
Mezzanine is a CMS application built on the Django web framework. The installation steps are easy, but your environment may not be suitable enough for it to work without a problem. So, here I'm going to describe a complete installation from scratch in a virtual environment.
First of all, install virtualenv:
$ sudo apt-get install python-virtualenv Then, create a virtual environment:
$ virtualenv testenv And, activate it: $ cd testenv $ source bin/activate
Tag: reportlab
Blog
ImportError: Reportlab Version 2.1+ is needed
A little bug in xhtml2pdf version 0.0.5. To fix it:
$ sudo nano /usr/local/lib/python2.7/dist-packages/xhtml2pdf/util.py Change the following lines:
if not (reportlab.Version[0] == "2" and reportlab.Version[2] >= "1"): raise ImportError("Reportlab Version 2.1+ is needed!") REPORTLAB22 = (reportlab.Version[0] == "2" and reportlab.Version[2] >= "2") With these lines:
if not (reportlab.Version[:3] >= "2.1"): raise ImportError("Reportlab Version 2.1+ is needed!") REPORTLAB22 = (reportlab.Version[:3] >= "2.1")
Tag: xhtml2pdf
Blog
ImportError: Reportlab Version 2.1+ is needed
A little bug in xhtml2pdf version 0.0.5. To fix it:
$ sudo nano /usr/local/lib/python2.7/dist-packages/xhtml2pdf/util.py Change the following lines:
if not (reportlab.Version[0] == "2" and reportlab.Version[2] >= "1"): raise ImportError("Reportlab Version 2.1+ is needed!") REPORTLAB22 = (reportlab.Version[0] == "2" and reportlab.Version[2] >= "2") With these lines:
if not (reportlab.Version[:3] >= "2.1"): raise ImportError("Reportlab Version 2.1+ is needed!") REPORTLAB22 = (reportlab.Version[:3] >= "2.1")
Tag: migrations
Blog
Django Migrations Table Already Exists Fix
Fix this issue by faking the migrations:
python manage.py migrate --fake <appname> Taken from this SO answer
Tag: table already exists
Blog
Django Migrations Table Already Exists Fix
Fix this issue by faking the migrations:
python manage.py migrate --fake <appname> Taken from this SO answer
Tag: banners
Blog
Mezzanine BS Banners Translation with django-modeltranslation
Mezzanine BS Banners is a nice app for adding Bootstrap 3 banners/sliders to your Mezzanine projects. The Banners model in the BS Banners app has a title, and its stacked inline Slides model has a title and content for translation.
After [installing and setting up Django/Mezzanine translations]({% post_url 2015-07-01-djangomezzanine-content-translation-for-mezzanine %}):
Create a translation.py inside your Mezzanine project or your custom theme/skin application and copy/paste the following lines:
from modeltranslation.translator import translator from mezzanine.core.translation import TranslatedSlugged, TranslatedRichText from mezzanine_bsbanners.
Tag: bootstrap
Blog
Mezzanine BS Banners Translation with django-modeltranslation
Mezzanine BS Banners is a nice app for adding Bootstrap 3 banners/sliders to your Mezzanine projects. The Banners model in the BS Banners app has a title, and its stacked inline Slides model has a title and content for translation.
After [installing and setting up Django/Mezzanine translations]({% post_url 2015-07-01-djangomezzanine-content-translation-for-mezzanine %}):
Create a translation.py inside your Mezzanine project or your custom theme/skin application and copy/paste the following lines:
from modeltranslation.translator import translator from mezzanine.core.translation import TranslatedSlugged, TranslatedRichText from mezzanine_bsbanners.
Tag: bsbanners
Blog
Mezzanine BS Banners Translation with django-modeltranslation
Mezzanine BS Banners is a nice app for implementing Bootstrap 3 banners/sliders to your Mezzanine projects. The Banners model in BS Banners app has a title and its stacked inline Slides model has title and content for translation.
After [installing and setting up Django/Mezzanine translations]({% post_url 2015-07-01-djangomezzanine-content-translation-for-mezzanine %}):
Create a translation.py inside your Mezzanine project or your custom theme/skin application and copy/paste the following lines:
from modeltranslation.translator import translator
from mezzanine.core.translation import TranslatedSlugged, TranslatedRichText
from mezzanine_bsbanners.
Tag: sliders
Blog
Mezzanine BS Banners Translation with django-modeltranslation
Mezzanine BS Banners is a nice app for implementing Bootstrap 3 banners/sliders to your Mezzanine projects. The Banners model in BS Banners app has a title and its stacked inline Slides model has title and content for translation.
After [installing and setting up Django/Mezzanine translations]({% post_url 2015-07-01-djangomezzanine-content-translation-for-mezzanine %}):
Create a translation.py inside your Mezzanine project or your custom theme/skin application and copy/paste the following lines:
from modeltranslation.translator import translator
from mezzanine.core.translation import TranslatedSlugged, TranslatedRichText
from mezzanine_bsbanners.
Tag: translation
Blog
Django/Mezzanine Content Translation for Mezzanine Built-in Applications
Mezzanine comes with additional Django applications such as pages and galleries; to translate their content, Mezzanine supports django-modeltranslation integration.
Install django-modeltranslation:
pip install django-modeltranslation Add the following to INSTALLED_APPS in settings.py:
"modeltranslation", And the following in settings.py:
USE_MODELTRANSLATION = True Also, move mezzanine.pages to the top of the other Mezzanine apps in INSTALLED_APPS in settings.py like so:
"mezzanine.pages",
"mezzanine.boot",
"mezzanine.conf",
"mezzanine.core",
"mezzanine.generic",
"mezzanine.blog",
"mezzanine.forms",
"mezzanine.galleries",
"mezzanine.twitter",
"mezzanine.accounts",
"mezzanine.mobile",
Run the following to create fields in database tables for translations:
Blog
Setting Up Templates and Python Scripts for Translation
Templates need the following template tag:
{% raw %}{% load i18n %}{% endraw %} Then, wrapping any text with
{% raw %}{% trans "TEXT" %}{% endraw %} will make it translatable via the Rosetta Django application
In Python scripts, you need to import the following:
from django.utils.translation import ugettext_lazy as _ Then wrapping any text with
_('TEXT') will make it translatable.
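To illustrate the marking pattern above in a self-contained way, here is a minimal sketch using only Python's standard-library gettext module; in a real Django project you would import ugettext_lazy from django.utils.translation instead, and makemessages would extract the marked strings into PO files.

```python
# Stand-alone stand-in for the Django pattern above. With no
# translation catalog installed, _() returns the string unchanged.
import gettext

_ = gettext.gettext  # in Django: from django.utils.translation import ugettext_lazy as _

greeting = _("Hello")  # marked for extraction into a PO file
```

Once a catalog for the active language is installed, the same call returns the translated string instead.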
Blog
Django Rosetta Installation
Install SciPy:
sudo apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose Install pymongo and nltk:
sudo pip install pymongo sudo pip install nltk Install Python MySQLdb:
sudo apt-get install python-mysqldb Install Rosetta:
sudo pip install django-rosetta Add the following into INSTALLED_APPS in settings.py:
"rosetta", Add the following into urls.py:
url(r'^translations/', include('rosetta.urls')), To also allow language prefixes, change patterns to i18n_patterns in urls.py:
urlpatterns += i18n_patterns(
    ...
)
Tag: bash
Blog
Convert XLS/XLSX to CSV in Bash
In most modern Linux distributions, LibreOffice is available and can be used to convert XLS or XLSX file(s) to CSV file(s) in Bash.
For XLS file(s):
for i in *.xls; do libreoffice --headless --convert-to csv "$i"; done For XLSX file(s):
for i in *.xlsx; do libreoffice --headless --convert-to csv "$i"; done You may get the following warning, but it still works fine:
javaldx: Could not find a Java Runtime Environment!
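For completeness, the same conversion can be driven from Python. This is a sketch under assumptions: the helper name convert_command is mine, and actually running it still requires LibreOffice to be installed.

```python
import subprocess
from pathlib import Path

def convert_command(path):
    # Build the same argv the bash loop above invokes per file.
    return ["libreoffice", "--headless", "--convert-to", "csv", str(path)]

def convert_all(pattern="*.xlsx", directory="."):
    # Convert every matching spreadsheet in `directory` to CSV.
    for f in Path(directory).glob(pattern):
        subprocess.run(convert_command(f), check=True)
```

Separating command construction from execution makes the loop easy to test without LibreOffice present.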
Blog
Running Script on Cluster (StarCluster)
Start a new cluster with the configuration file you modified:
starcluster start cluster_name Send the script to the running cluster:
starcluster put cluster_name myscr.csh /home/myscr.csh Run it using source:
starcluster sshmaster cluster_name "source /home/myscr.csh >& /home/myscr.log"
Blog
How To Make A File or Script Executable in Ubuntu
Start a terminal; CTRL + Alt + T can be used (or just go to Dash Home and type Terminal):
Run this command below:
sudo chmod +x /path/to/your/file Source
Tag: csv
Blog
Convert XLS/XLSX to CSV in Bash
In most modern Linux distributions, LibreOffice is available and can be used to convert XLS or XLSX file(s) to CSV file(s) in Bash.
For XLS file(s):
for i in *.xls; do libreoffice --headless --convert-to csv "$i"; done For XLSX file(s):
for i in *.xlsx; do libreoffice --headless --convert-to csv "$i"; done You may get the following warning, but it still works fine:
javaldx: Could not find a Java Runtime Environment!
Blog
Data Preprocessing I for Salmon Project
Since we'll be using R for most of the analyses, we converted the XLS data file to CSV using MS Office Excel 2013, and then we had to fix several lines using Sublime Text 2 because three columns in these lines were left unquoted, which later created a problem when reading in RStudio.
The data contains phosphorylation data for 8553 peptides. There are many missing data points for many peptides, and since the peptides were identified by IPI IDs, which are no longer supported, we had to convert the IPI IDs to HGNC approved symbols; the data already had these symbols as names, but they looked outdated.
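The ID conversion step described above amounts to a table lookup. The sketch below uses obviously hypothetical placeholder entries (the IPI IDs and symbols are not real mappings); in practice the table would come from an HGNC or BioMart cross-reference download.

```python
# Hypothetical IPI -> HGNC lookup; these pairs are illustrative
# placeholders only, not verified mappings.
IPI_TO_HGNC = {
    "IPI00000001": "GENE1",
    "IPI00000002": "GENE2",
}

def to_hgnc(ipi_id):
    """Return the HGNC symbol for an IPI ID, or None if unmapped."""
    return IPI_TO_HGNC.get(ipi_id)
```

Returning None for unmapped IDs makes it easy to count how many peptides are lost in the conversion.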
Blog
Playing around with CellNOptR Tool and MIDAS File
With CellNOptR, we will try to construct network models for the challenge. For this, the tool needs two inputs. The first is a special data object called CNOlist that stores vectors and matrices of data. The second is a .SIF file that contains a prior knowledge network, which can be obtained from pathway databases and analysis tools.
CNOlist contains the following fields: namesSignals, namesCues, namesStimuli and namesInhibitors, which are vectors storing the names of measurements.
Tag: excel
Blog
Convert XLS/XLSX to CSV in Bash
In most modern Linux distributions, LibreOffice is available and can be used to convert XLS or XLSX file(s) to CSV file(s) in Bash.
For XLS file(s):
for i in *.xls; do libreoffice --headless --convert-to csv "$i"; done For XLSX file(s):
for i in *.xlsx; do libreoffice --headless --convert-to csv "$i"; done You may get the following warning, but it still works fine:
javaldx: Could not find a Java Runtime Environment!
Tag: libre office
Blog
Convert XLS/XLSX to CSV in Bash
In most modern Linux distributions, LibreOffice is available and can be used to convert XLS or XLSX file(s) to CSV file(s) in Bash.
For XLS file(s):
for i in *.xls; do libreoffice --headless --convert-to csv "$i"; done For XLSX file(s):
for i in *.xlsx; do libreoffice --headless --convert-to csv "$i"; done You may get the following warning, but it still works fine:
javaldx: Could not find a Java Runtime Environment!
Tag: xls
Blog
Convert XLS/XLSX to CSV in Bash
In most modern Linux distributions, LibreOffice is available and can be used to convert XLS or XLSX file(s) to CSV file(s) in Bash.
For XLS file(s):
for i in *.xls; do libreoffice --headless --convert-to csv "$i"; done For XLSX file(s):
for i in *.xlsx; do libreoffice --headless --convert-to csv "$i"; done You may get the following warning, but it still works fine:
javaldx: Could not find a Java Runtime Environment!
Blog
Salmonella Data Preprocessing for PCSF Algorithm
This post describes data preprocessing in Salmonella project for Prize-Collecting Steiner Forest Problem (PCSF) algorithm.
Salmonella data taken from Table S6 in Phosphoproteomic Analysis of Salmonella-Infected Cells Identifies Key Kinase Regulators and SopB-Dependent Host Phosphorylation Events by Rogers, LD et al. has been converted from its original XLS file to a tab-delimited TXT file for easy reading in Python.
The data should be separated into time-point files (2, 5, 10 and 20 minutes), each of which will contain the corresponding phosphoproteins and their fold changes.
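The separation step above can be sketched as follows. The row layout (protein, time, fold change) and the function names are assumptions, since the post does not show the actual columns of the tab-delimited file.

```python
import csv

TIME_POINTS = ("2", "5", "10", "20")  # minutes

def split_by_time_point(rows):
    """Group (protein, time, fold_change) rows into one list per time point."""
    buckets = {t: [] for t in TIME_POINTS}
    for protein, time, fold_change in rows:
        if time in buckets:
            buckets[time].append((protein, float(fold_change)))
    return buckets

def split_file(path):
    # Read the tab-delimited TXT file and split it by time point.
    with open(path, newline="") as fh:
        return split_by_time_point(csv.reader(fh, delimiter="\t"))
```

Each bucket can then be written out as its own per-time-point input file for the PCSF algorithm.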
Blog
Data Preprocessing I for Salmon Project
Since we'll be using R for most of the analyses, we converted the XLS data file to CSV using MS Office Excel 2013, and then we had to fix several lines using Sublime Text 2 because three columns in these lines were left unquoted, which later created a problem when reading in RStudio.
The data contains phosphorylation data for 8553 peptides. There are many missing data points for many peptides, and since the peptides were identified by IPI IDs, which are no longer supported, we had to convert the IPI IDs to HGNC approved symbols; the data already had these symbols as names, but they looked outdated.
Tag: xls to csv
Blog
Convert XLS/XLSX to CSV in Bash
In most modern Linux distributions, LibreOffice is available and can be used to convert XLS or XLSX file(s) to CSV file(s) in Bash.
For XLS file(s):
for i in *.xls; do libreoffice --headless --convert-to csv "$i"; done For XLSX file(s):
for i in *.xlsx; do libreoffice --headless --convert-to csv "$i"; done You may get the following warning, but it still works fine:
javaldx: Could not find a Java Runtime Environment!
Tag: xlsx
Blog
Convert XLS/XLSX to CSV in Bash
In most modern Linux distributions, LibreOffice is available and can be used to convert XLS or XLSX file(s) to CSV file(s) in Bash.
For XLS file(s):
for i in *.xls; do libreoffice --headless --convert-to csv "$i"; done For XLSX file(s):
for i in *.xlsx; do libreoffice --headless --convert-to csv "$i"; done You may get the following warning, but it still works fine:
javaldx: Could not find a Java Runtime Environment!
Tag: xlsx to csv
Blog
Convert XLS/XLSX to CSV in Bash
In most modern Linux distributions, LibreOffice is available and can be used to convert XLS or XLSX file(s) to CSV file(s) in Bash.
For XLS file(s):
for i in *.xls; do libreoffice --headless --convert-to csv "$i"; done For XLSX file(s):
for i in *.xlsx; do libreoffice --headless --convert-to csv "$i"; done You may get the following warning, but it still works fine:
javaldx: Could not find a Java Runtime Environment!
Tag: python scripts
Blog
Setting Up Templates and Python Scripts for Translation
Templates need the following template tag:
{% raw %}{% load i18n %}{% endraw %} Then, wrapping any text with
{% raw %}{% trans "TEXT" %}{% endraw %} will make it translatable via the Rosetta Django application
In Python scripts, you need to import the following:
from django.utils.translation import ugettext_lazy as _ Then wrapping any text with
_('TEXT') will make it translatable.
Tag: rosetta
Blog
Setting Up Templates and Python Scripts for Translation
Templates need the following template tag:
{% raw %}{% load i18n %}{% endraw %} Then, wrapping any text with
{% raw %}{% trans "TEXT" %}{% endraw %} will make it translatable via the Rosetta Django application
In Python scripts, you need to import the following:
from django.utils.translation import ugettext_lazy as _ Then wrapping any text with
_('TEXT') will make it translatable.
Blog
Django Rosetta Translations for Django Applications
Make a directory called locale/ under the application directory:
cd app_name mkdir locale Add the folder to the LOCALE_PATHS tuple in settings.py:
LOCALE_PATHS = (
    os.path.join(PROJECT_ROOT, 'app_name', 'locale/'),
) Run the following commands to create and compile the PO translation file for the application:
python ../manage.py makemessages -l tr -e html,py,txt python ../manage.py compilemessages Option -l is for language; it should match your definition in settings.py:
LANGUAGES = (
    ('en', _('English')),
    ('tr', _('Turkish')),
    ('it', _('Italian')),
) Repeat the last step for all languages and then go to the Rosetta URL to translate.
Blog
Django Rosetta Installation
Install SciPy:
sudo apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose Install pymongo and nltk:
sudo pip install pymongo sudo pip install nltk Install Python MySQLdb:
sudo apt-get install python-mysqldb Install Rosetta:
sudo pip install django-rosetta Add the following into INSTALLED_APPS in settings.py:
"rosetta", Add the following into urls.py:
url(r'^translations/', include('rosetta.urls')), To also allow language prefixes, change patterns to i18n_patterns in urls.py:
urlpatterns += i18n_patterns(
    ...
)
Tag: templates
Blog
Setting Up Templates and Python Scripts for Translation
Templates need following template tag:
1{% raw %}{% load i18n %}{% endraw %} Then, wrapping any text with
1{% raw %}{% trans "TEXT" %}{% endraw %} will make it translatable via Rosetta Django application
In Python scripts, you need to import following library:
from django.utils.translation import ugettext_lazy as _ Then wrapping any text with
1_('TEXT') will make it translatable.
Tag: django translation
Blog
Django Rosetta Translations for Django Applications
Make a directory called locale/ under the application directory:
cd app_name mkdir locale Add the folder to the LOCALE_PATHS tuple in settings.py:
LOCALE_PATHS = (
    os.path.join(PROJECT_ROOT, 'app_name', 'locale/'),
) Run the following commands to create and compile the PO translation file for the application:
python ../manage.py makemessages -l tr -e html,py,txt python ../manage.py compilemessages Option -l is for language; it should match your definition in settings.py:
LANGUAGES = (
    ('en', _('English')),
    ('tr', _('Turkish')),
    ('it', _('Italian')),
) Repeat the last step for all languages and then go to the Rosetta URL to translate.
Tag: django-rosetta
Blog
Django Rosetta Translations for Django Applications
Make a directory called locale/ under the application directory:
cd app_name mkdir locale Add the folder to the LOCALE_PATHS tuple in settings.py:
LOCALE_PATHS = (
    os.path.join(PROJECT_ROOT, 'app_name', 'locale/'),
) Run the following commands to create and compile the PO translation file for the application:
python ../manage.py makemessages -l tr -e html,py,txt python ../manage.py compilemessages Option -l is for language; it should match your definition in settings.py:
LANGUAGES = (
    ('en', _('English')),
    ('tr', _('Turkish')),
    ('it', _('Italian')),
) Repeat the last step for all languages and then go to the Rosetta URL to translate.
Tag: localization
Blog
Django Rosetta Installation
Install SciPy:
sudo apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose Install pymongo and nltk:
sudo pip install pymongo sudo pip install nltk Install Python MySQLdb:
sudo apt-get install python-mysqldb Install Rosetta:
sudo pip install django-rosetta Add the following into INSTALLED_APPS in settings.py:
"rosetta", Add the following into urls.py:
url(r'^translations/', include('rosetta.urls')), To also allow language prefixes, change patterns to i18n_patterns in urls.py:
urlpatterns += i18n_patterns(
    ...
)
Tag: cheminformatics
Blog
Obtaining Molecule Description using Open Babel / PyBel
Open Babel is a great tool to analyze and investigate molecular data (.MOL, .SDF files). Its Python API is particularly nice if you are already familiar with Python. In this post, I'll demonstrate how you can obtain molecule descriptors such as molecular weight, HBA, HBD, logP, formula and number of chiral centers using PyBel.
Installation
$ sudo apt-get install openbabel python-openbabel Usage for MW, HBA, HBD, logP
After reading the .MOL file, we need to use the calcdesc method with the descnames argument to get the descriptors.
Tag: chiral centers
Blog
Obtaining Molecule Description using Open Babel / PyBel
Open Babel is a great tool to analyze and investigate molecular data (.MOL, .SDF files). Its Python API is particularly nice if you are already familiar with Python. In this post, I'll demonstrate how you can obtain molecule descriptors such as molecular weight, HBA, HBD, logP, formula and number of chiral centers using PyBel.
Installation
$ sudo apt-get install openbabel python-openbabel Usage for MW, HBA, HBD, logP
After reading the .MOL file, we need to use the calcdesc method with the descnames argument to get the descriptors.
Tag: clogp
Blog
Obtaining Molecule Description using Open Babel / PyBel
Open Babel is a great tool to analyze and investigate molecular data (.MOL, .SDF files). Its Python API is particularly nice if you are already familiar with Python. In this post, I'll demonstrate how you can obtain molecule descriptors such as molecular weight, HBA, HBD, logP, formula and number of chiral centers using PyBel.
Installation
$ sudo apt-get install openbabel python-openbabel Usage for MW, HBA, HBD, logP
After reading the .MOL file, we need to use the calcdesc method with the descnames argument to get the descriptors.
Tag: hba
Blog
Obtaining Molecule Description using Open Babel / PyBel
Open Babel is a great tool to analyze and investigate molecular data (.MOL, .SDF files). Its Python API is particularly nice if you are already familiar with Python. In this post, I'll demonstrate how you can obtain molecule descriptors such as molecular weight, HBA, HBD, logP, formula and number of chiral centers using PyBel.
Installation
$ sudo apt-get install openbabel python-openbabel Usage for MW, HBA, HBD, logP
After reading the .MOL file, we need to use the calcdesc method with the descnames argument to get the descriptors.
Tag: hbd
Blog
Obtaining Molecule Description using Open Babel / PyBel
Open Babel is a great tool to analyze and investigate molecular data (.MOL, .SDF files). Its Python API is particularly nice if you are already familiar with Python. In this post, I'll demonstrate how you can obtain molecule descriptors such as molecular weight, HBA, HBD, logP, formula and number of chiral centers using PyBel.
Installation
$ sudo apt-get install openbabel python-openbabel Usage for MW, HBA, HBD, logP
After reading the .MOL file, we need to use the calcdesc method with the descnames argument to get the descriptors.
Tag: logp
Blog
Obtaining Molecule Description using Open Babel / PyBel
Open Babel is a great tool to analyze and investigate molecular data (.MOL, .SDF files). Its Python API is particularly nice if you are already familiar with Python. In this post, I'll demonstrate how you can obtain molecule descriptors such as molecular weight, HBA, HBD, logP, formula and number of chiral centers using PyBel.
Installation
$ sudo apt-get install openbabel python-openbabel Usage for MW, HBA, HBD, logP
After reading the .MOL file, we need to use the calcdesc method with the descnames argument to get the descriptors.
Tag: molecular weight
Blog
Obtaining Molecule Description using Open Babel / PyBel
Open Babel is a great tool to analyze and investigate molecular data (.MOL, .SDF files). Its Python API is particularly nice if you are already familiar with Python. In this post, I'll demonstrate how you can obtain molecule descriptors such as molecular weight, HBA, HBD, logP, formula and number of chiral centers using PyBel.
Installation
$ sudo apt-get install openbabel python-openbabel Usage for MW, HBA, HBD, logP
After reading the .MOL file, we need to use the calcdesc method with the descnames argument to get the descriptors.
Tag: molecule
Blog
Obtaining Molecule Description using Open Babel / PyBel
Open Babel is a great tool to analyze and investigate molecular data (.MOL, .SDF files). Its Python API is particularly nice if you are already familiar with Python. In this post, I'll demonstrate how you can obtain molecule descriptors such as molecular weight, HBA, HBD, logP, formula and number of chiral centers using PyBel.
Installation
$ sudo apt-get install openbabel python-openbabel Usage for MW, HBA, HBD, logP
After reading the .MOL file, we need to use the calcdesc method with the descnames argument to get the descriptors.
Tag: pybel
Blog
Obtaining Molecule Description using Open Babel / PyBel
Open Babel is a great tool to analyze and investigate molecular data (.MOL, .SDF files). Its Python API is particularly nice if you are already familiar with Python. In this post, I'll demonstrate how you can obtain molecule descriptors such as molecular weight, HBA, HBD, logP, formula and number of chiral centers using PyBel.
Installation
$ sudo apt-get install openbabel python-openbabel Usage for MW, HBA, HBD, logP
After reading the .MOL file, we need to use the calcdesc method with the descnames argument to get the descriptors.
Tag: csh
Blog
Running Script on Cluster (StarCluster)
Start a new cluster with the configuration file you modified:
starcluster start cluster_name Send the script to the running cluster:
starcluster put cluster_name myscr.csh /home/myscr.csh Run it using source:
starcluster sshmaster cluster_name "source /home/myscr.csh >& /home/myscr.log"
Tag: put
Blog
Running Script on Cluster (StarCluster)
Start a new cluster with the configuration file you modified:
starcluster start cluster_name Send the script to the running cluster:
starcluster put cluster_name myscr.csh /home/myscr.csh Run it using source:
starcluster sshmaster cluster_name "source /home/myscr.csh >& /home/myscr.log"
Tag: script
Blog
Running Script on Cluster (StarCluster)
Start a new cluster with the configuration file you modified:
starcluster start cluster_name Send the script to the running cluster:
starcluster put cluster_name myscr.csh /home/myscr.csh Run it using source:
starcluster sshmaster cluster_name "source /home/myscr.csh >& /home/myscr.log"
Blog
Examining MegaBLAST Results - Parsing
The last step in the pipeline is to examine the outputs produced for the searched sequences with another script. This step reads each megablast file, stores the values of the sequences' parameters such as name, identity and overlapping length, and prints them to the screen in a way suited to the task at hand.
In my project I use a parser called Inslink from the HUSAR package, which returns the fields mentioned above to me as an array. The only thing this parser does is read the file and store the values of the requested fields.
Afterwards, I display these stored values by extending the code, and with a few additional lines of code I show the meaningful results I need.
Blog
Examining the New Dataset
Because the previous data I used for testing while designing the pipeline was very poor, I obtained a new dataset. Of course, during the testing phase it is useful to use multiple datasets with different characteristics. However, I can say that the previous dataset was too poor to give even a few meaningful results. You can see the details [here]({% post_url 2012-07-06-eslestirme-ve-eslesmeyen-okumalari %}).
The new dataset is again human genome data; the BAM file was 1.8 GB in size and contained both mapped and unmapped reads. Using the bam2fastq tool, I both converted this BAM file to a FASTQ file and, by filtering out the mapped reads, 0.
Blog
Running MegaBLAST from Multiple Sequence Files
I wrote the script below to fit a technique we came up with to make the pipeline's MegaBLAST search faster. What it does is use the sequence files that were created and formatted for each read to search the databases, given a specified starting point and number of reads.
#!/usr/local/bin/perl

$database = $ARGV[0];
$dir = $ARGV[1];  # directory for sequences
$sp = $ARGV[2];   # starting point
$n = $ARGV[3] + $sp;

while (1) {
    system("blastplus -programname=megablast $dir/read_$sp.seq $database -OUTFILE=read_$sp.megablast -nobatch -d");
    $sp++;
    last if ($sp == $n);
}
Everything here works through really very simple programming.
Blog
Perl Script for FASTQ to FASTA Conversion
FASTQ and FASTA are file formats that actually contain the same information, except that one simply has two fewer lines of information per sequence. Another difference that matters for my project is that the FASTA format can be used directly for MegaBLAST searches. That is why I need to convert the FASTQ format produced by sequencing machines to FASTA. And this script is the first step of the pipeline.
Actually, as a preliminary step, I had already aligned the test genetic sequence data myself, since it had not been aligned by the party who delivered it to me.
Tag: shell
Blog
Running Script on Cluster (StarCluster)
Start a new cluster with the configuration file you modified:
starcluster start cluster_name Send the script to the running cluster:
starcluster put cluster_name myscr.csh /home/myscr.csh Run it using source:
starcluster sshmaster cluster_name "source /home/myscr.csh >& /home/myscr.log"
Tag: sshmaster
Blog
Running Script on Cluster (StarCluster)
Start a new cluster with the configuration file you modified:
starcluster start cluster_name Send the script to the running cluster:
starcluster put cluster_name myscr.csh /home/myscr.csh Run it using source:
starcluster sshmaster cluster_name "source /home/myscr.csh >& /home/myscr.log"
Tag: scp
Blog
Uploading Files to AWS using SSH/SCP
Here is a small command for uploading files to AWS through SSH's scp command (secure copy).
scp -i path/to/your/key-pairs/file path/to/file/you/want/to/upload ubuntu@PUBLIC_DNS:path/to/the/destination
Tag: upload
Blog
Uploading Files to AWS using SSH/SCP
Here is a small command for uploading files to AWS through SSH's scp command (secure copy).
scp -i path/to/your/key-pairs/file path/to/file/you/want/to/upload ubuntu@PUBLIC_DNS:path/to/the/destination
Tag: permission denied
Blog
Errno 13 Permission denied Django File Uploads
Run the following command to give www-data permissions to the static folder and all its content:
cd path/to/your/django/project sudo chown -R www-data:www-data static/ Do this on your production server
Blog
session_start() Permission denied (13) Laravel 4
Solve it by running the following lines:
chmod -R 755 /path/to/your/laravel/directory chmod -R o+w /path/to/your/laravel/directory And/or maybe:
sudo chown -R www-data:user /path/to/your/laravel/directory
Blog
Permission Issues develop Laravel 4 on Ubuntu 12.04 LTS
If your CSS or JS files don't seem to load, or you get 403 Forbidden or Permission denied, all you need to do is run the following in a terminal:
sudo chmod -R 755 /path/to/your/laravel/directory
Tag: apache
Blog
Configuring Mezzanine for Apache server & mod_wsgi in AWS
Install [Mezzanine]({% post_url 2015-05-01-how-to-install-mezzanine-on-ubuntulinux-mint %}), [Apache server]({% post_url 2015-05-08-getting-started-with-your-aws-instance-and %}) and mod_wsgi:
sudo apt-get install libapache2-mod-wsgi sudo a2enmod wsgi Set up a MySQL database for your Mezzanine project
Read [my post on how to set up a MySQL database for a Mezzanine project]({% post_url 2015-05-09-how-to-set-up-a-mysql-database-for-a-mezzanine %})
Collect static files:
python manage.py collectstatic Configure your Apache server for the project like the following:
WSGIPythonPath /home/ubuntu/www/mezzanine-project <VirtualHost *:80> #ServerName example.com ServerAdmin admin@example.com DocumentRoot /home/ubuntu/www/mezzanine-project WSGIScriptAlias / /home/ubuntu/www/mezzanine-project/wsgi.
Tag: wsgi
Blog
Configuring Mezzanine for Apache server & mod_wsgi in AWS
Install [Mezzanine]({% post_url 2015-05-01-how-to-install-mezzanine-on-ubuntulinux-mint %}), [Apache server]({% post_url 2015-05-08-getting-started-with-your-aws-instance-and %}) and mod_wsgi:
sudo apt-get install libapache2-mod-wsgi sudo a2enmod wsgi Set up a MySQL database for your Mezzanine project
Read [my post on how to set up a MySQL database for a Mezzanine project]({% post_url 2015-05-09-how-to-set-up-a-mysql-database-for-a-mezzanine %})
Collect static files:
python manage.py collectstatic Configure your Apache server for the project like the following:
WSGIPythonPath /home/ubuntu/www/mezzanine-project <VirtualHost *:80> #ServerName example.com ServerAdmin admin@example.com DocumentRoot /home/ubuntu/www/mezzanine-project WSGIScriptAlias / /home/ubuntu/www/mezzanine-project/wsgi.
Tag: custom tcp
Blog
Setting Up Mezzanine Projects in AWS
Go to EC2 management console, Security groups and add a Custom TCP inbound rule with port 8000. Select “Anywhere” from the list.
Then follow [this to install Mezzanine]({% post_url 2015-05-01-how-to-install-mezzanine-on-ubuntulinux-mint %})
The tutorial above also explains setting up a site record. Mezzanine's default site record is 127.0.0.1:8000, which should be 0.0.0.0:8000 in our case. So, enter 0.0.0.0:8000 when you're asked for a site record when you run
python manage.py createdb Also, you might still need to provide this site record while running the development server:
Tag: ec2
Blog
Setting Up Mezzanine Projects in AWS
Go to EC2 management console, Security groups and add a Custom TCP inbound rule with port 8000. Select “Anywhere” from the list.
Then follow [this to install Mezzanine]({% post_url 2015-05-01-how-to-install-mezzanine-on-ubuntulinux-mint %})
The tutorial above also explains setting up a site record. Mezzanine's default site record is 127.0.0.1:8000, which should be 0.0.0.0:8000 in our case. So, enter 0.0.0.0:8000 when you're asked for a site record when you run
python manage.py createdb Also, you might still need to provide this site record while running the development server:
Tag: chmod
Blog
AWS Start an Instance and Connect to it
Go to EC2 management console
Create a new key-pair if necessary and download it
Launch an instance
Add HTTP security group for web applications over HTTP
Get public DNS
Change permissions on the key-pair file:
chmod 400 path/to/your/file.pem Connect:
ssh -i path/to/your/file.pem ubuntu@PUBLIC_DNS Note: ubuntu is for connecting to an Ubuntu 64-bit instance. It's different for others
Tag: launch instance
Blog
AWS Start an Instance and Connect to it
Go to EC2 management console
Create a new key-pair if necessary and download it
Launch an instance
Add HTTP security group for web applications over HTTP
Get public DNS
Change permissions on the key-pair file:
chmod 400 path/to/your/file.pem Connect:
ssh -i path/to/your/file.pem ubuntu@PUBLIC_DNS Note: ubuntu is for connecting to an Ubuntu 64-bit instance. It's different for others
Tag: directory
Blog
How to Get Path to or Directory of Current Script in R
Use the following code to get the path to, or the directory of, the current (running) script in R:
scr_dir <- dirname(sys.frame(1)$ofile) scr_path <- paste(scr_dir, "script.R", sep="/") Taken from SO
Tag: path
Blog
How to Get Path to or Directory of Current Script in R
Use the following code to get the path to, or the directory of, the current (running) script in R:
scr_dir <- dirname(sys.frame(1)$ofile) scr_path <- paste(scr_dir, "script.R", sep="/") Taken from SO
Tag: rscript
Blog
How to Get Path to or Directory of Current Script in R
Use the following code to get the path to, or the directory of, the current (running) script in R:
scr_dir <- dirname(sys.frame(1)$ofile) scr_path <- paste(scr_dir, "script.R", sep="/") Taken from SO
Tag: bioconductor
Blog
How to Get (or Load) NCBI GEO Microarray Data into R using GEOquery Package from Bioconductor
R, especially with lots of Bioconductor packages, provides nice tools to load, manage and analyze microarray data. If you are trying to load NCBI GEO data into R, use the GEOquery package. Here, I'll describe how to get started with it; I'll probably mention more in future posts.
Installation
source("http://bioconductor.org/biocLite.R") biocLite("GEOquery") Usage
library(GEOquery) gds <- getGEO("GDS5072") or
library(GEOquery) gds <- getGEO(filename="path/to/GDS5072.soft.gz") The getGEO function returns a complex GDS class object which contains the complete dataset.
Blog
Progress on Network Inference Sub-Challenge
This sub-challenge has several requirements:
Directed and causal edges on the models (32 models: 4 cell lines × 8 stimuli). Edges should be scored (normalized to a range between 0 and 1) to show confidence. Nodes will be phosphoproteins from the data. A prior knowledge network (which can be constructed using pathway databases) is allowed (actually a must for some network inference tools). The first thing was to look for existing tools.
Tag: gds
Blog
How to Get (or Load) NCBI GEO Microarray Data into R using GEOquery Package from Bioconductor
R, especially with its many Bioconductor packages, provides nice tools to load, manage and analyze microarray data. If you are trying to load NCBI GEO data into R, use the GEOquery package. Here, I’ll describe how to get started with it, and I’ll probably mention more in future posts.
Installation
source("http://bioconductor.org/biocLite.R")
biocLite("GEOquery")
Usage
library(GEOquery)
gds <- getGEO("GDS5072")
or
library(GEOquery)
gds <- getGEO(filename="path/to/GDS5072.soft.gz")
The getGEO function returns a GDS object, a complex class that contains the complete dataset.
Tag: geo
Blog
How to Get (or Load) NCBI GEO Microarray Data into R using GEOquery Package from Bioconductor
R, especially with its many Bioconductor packages, provides nice tools to load, manage and analyze microarray data. If you are trying to load NCBI GEO data into R, use the GEOquery package. Here, I’ll describe how to get started with it, and I’ll probably mention more in future posts.
Installation
source("http://bioconductor.org/biocLite.R")
biocLite("GEOquery")
Usage
library(GEOquery)
gds <- getGEO("GDS5072")
or
library(GEOquery)
gds <- getGEO(filename="path/to/GDS5072.soft.gz")
The getGEO function returns a GDS object, a complex class that contains the complete dataset.
Tag: geoquery
Blog
How to Get (or Load) NCBI GEO Microarray Data into R using GEOquery Package from Bioconductor
R, especially with its many Bioconductor packages, provides nice tools to load, manage and analyze microarray data. If you are trying to load NCBI GEO data into R, use the GEOquery package. Here, I’ll describe how to get started with it, and I’ll probably mention more in future posts.
Installation
source("http://bioconductor.org/biocLite.R")
biocLite("GEOquery")
Usage
library(GEOquery)
gds <- getGEO("GDS5072")
or
library(GEOquery)
gds <- getGEO(filename="path/to/GDS5072.soft.gz")
The getGEO function returns a GDS object, a complex class that contains the complete dataset.
Tag: load microarray data
Blog
How to Get (or Load) NCBI GEO Microarray Data into R using GEOquery Package from Bioconductor
R, especially with its many Bioconductor packages, provides nice tools to load, manage and analyze microarray data. If you are trying to load NCBI GEO data into R, use the GEOquery package. Here, I’ll describe how to get started with it, and I’ll probably mention more in future posts.
Installation
source("http://bioconductor.org/biocLite.R")
biocLite("GEOquery")
Usage
library(GEOquery)
gds <- getGEO("GDS5072")
or
library(GEOquery)
gds <- getGEO(filename="path/to/GDS5072.soft.gz")
The getGEO function returns a GDS object, a complex class that contains the complete dataset.
Tag: microarray data
Blog
How to Get (or Load) NCBI GEO Microarray Data into R using GEOquery Package from Bioconductor
R, especially with its many Bioconductor packages, provides nice tools to load, manage and analyze microarray data. If you are trying to load NCBI GEO data into R, use the GEOquery package. Here, I’ll describe how to get started with it, and I’ll probably mention more in future posts.
Installation
source("http://bioconductor.org/biocLite.R")
biocLite("GEOquery")
Usage
library(GEOquery)
gds <- getGEO("GDS5072")
or
library(GEOquery)
gds <- getGEO(filename="path/to/GDS5072.soft.gz")
The getGEO function returns a GDS object, a complex class that contains the complete dataset.
Tag: ncbi
Blog
How to Get (or Load) NCBI GEO Microarray Data into R using GEOquery Package from Bioconductor
R, especially with its many Bioconductor packages, provides nice tools to load, manage and analyze microarray data. If you are trying to load NCBI GEO data into R, use the GEOquery package. Here, I’ll describe how to get started with it, and I’ll probably mention more in future posts.
Installation
source("http://bioconductor.org/biocLite.R")
biocLite("GEOquery")
Usage
library(GEOquery)
gds <- getGEO("GDS5072")
or
library(GEOquery)
gds <- getGEO(filename="path/to/GDS5072.soft.gz")
The getGEO function returns a GDS object, a complex class that contains the complete dataset.
Blog
An Exon of Length 2 Appeared in Ensembl
I want to share an interesting finding from our research on the exon/intron analysis of human evolutionary history.
I had the genes that emerged at each pass point of human history, and I was using the Ensembl API to get the exons and introns of these genes for further analyses.
One gene (ENSG00000197568, HERV-H LTR-associating 3, HHLA3) held a surprise: one of its transcripts (ENST00000432224) had an exon (ENSE00001707577) of length 2.
Blog
Obtaining Species Names with Regular Expressions
Since at the end of my project I will show the user the names of possible contaminating organisms (Latin species names), I need to obtain the organism name for each sequence using the accession numbers in the MegaBLAST results. I can do this with another system, available on the HUSAR servers, called the Sequence Retrieval System (SRS).
To get the organism name from SRS, it is enough to type the "getz" command on the Unix command line together with the database name, the accession number and the field I want to retrieve. Below, you can find sample code that does this job.
Blog
Database Selection
My goal in this project is to find possible contaminating organisms (contaminants), so I need a large database. However, while keeping the database large provides that advantage, searching it for every sequence requires considerable computing power and time. For this reason, while developing my project I am also examining various databases, and investigating how I can restrict them to make them best suited for my purpose.
I started with NCBI's Reference Sequence (RefSeq) database.
Blog
What Is Bioinformatics? The Definition of Bioinformatics
With the sequencing of many organisms and, finally, of the human genome in 2001, when the sequence of all 3 billion base pairs was obtained, fields emerged that would use this information in different ways. Alongside the fields trying to understand these genes and to determine the proteins they produce, the need to analyze this information gave birth to the field of bioinformatics.
Bioinformatics is the analysis of biological information using computers and statistical techniques; in other words, bioinformatics is the science of developing and benefiting from computer databases and algorithms in order to improve and accelerate biological research [1].
Tag: virtualenv
Blog
How to Install Mezzanine on Ubuntu/Linux Mint [Complete Guide]
Mezzanine is a CMS application built on the Django web framework. The installation steps are easy, but your environment may not be suitable for it to work without problems. So here I’m going to describe a complete installation from scratch in a virtual environment.
First of all, install virtualenv:
$ sudo apt-get install python-virtualenv
Then, create a virtual environment:
$ virtualenv testenv
And activate it:
$ cd testenv
$ source bin/activate
Tag: clear
Blog
How to Clear (or Drop) DB Table of A Django App
Let’s say you created a Django app and ran python manage.py syncdb to create its table. Every time you make a change to the table, you’ll need to drop that table and run python manage.py syncdb again to update it. Here is how you drop the tables of a Django app:
$ python manage.py sqlclear app_name | python manage.py dbshell
To drop the tables of an app with migrations (Django >= 1.8):
$ python manage.py migrate appname zero
Recreate all the tables:
Tag: django app
Blog
How to Clear (or Drop) DB Table of A Django App
Let’s say you created a Django app and ran python manage.py syncdb to create its table. Every time you make a change to the table, you’ll need to drop that table and run python manage.py syncdb again to update it. Here is how you drop the tables of a Django app:
$ python manage.py sqlclear app_name | python manage.py dbshell
To drop the tables of an app with migrations (Django >= 1.8):
$ python manage.py migrate appname zero
Recreate all the tables:
Tag: sqlite3
Blog
How to Clear (or Drop) DB Table of A Django App
Let’s say you created a Django app and ran python manage.py syncdb to create its table. Every time you make a change to the table, you’ll need to drop that table and run python manage.py syncdb again to update it. Here is how you drop the tables of a Django app:
$ python manage.py sqlclear app_name | python manage.py dbshell
To drop the tables of an app with migrations (Django >= 1.8):
$ python manage.py migrate appname zero
Recreate all the tables:
Tag: tables
Blog
How to Clear (or Drop) DB Table of A Django App
Let’s say you created a Django app and ran python manage.py syncdb to create its table. Every time you make a change to the table, you’ll need to drop that table and run python manage.py syncdb again to update it. Here is how you drop the tables of a Django app:
$ python manage.py sqlclear app_name | python manage.py dbshell
To drop the tables of an app with migrations (Django >= 1.8):
$ python manage.py migrate appname zero
Recreate all the tables:
Tag: betweenness centrality
Blog
Salmonella - Host Interaction Network - A Detailed, Better Visualization
We’re almost done with the analyses and we’re making the final visualization of the network. As I previously posted, the network was clustered and visualized by time points. After that, we have done several more analyses and here I report how we visualized them. I’m going to post more about how we did the analyses separately.
First, the nodes are grouped into experimental and non-experimental (PCSF) nodes. This can easily be done by parsing the experimental network output and the network outputs of PCSF.
Tag: cytoscape
Blog
Salmonella - Host Interaction Network - A Detailed, Better Visualization
We’re almost done with the analyses and we’re making the final visualization of the network. As I previously posted, the network was clustered and visualized by time points. After that, we have done several more analyses and here I report how we visualized them. I’m going to post more about how we did the analyses separately.
First, the nodes are grouped into experimental and non-experimental (PCSF) nodes. This can easily be done by parsing the experimental network output and the network outputs of PCSF.
Blog
Network Clustering with NeAT - RNSC Algorithm
We obtained proteins at different time points from the experimental data, then found intermediate nodes (from the human interactome) using the PCSF algorithm, and finally, with a special matrix derived from the network that PCSF created, we validated the edges and determined edge directions using a divide-and-conquer ILP (integer linear programming) approach for the construction of large-scale signaling networks from PPI data. The resulting network is directed and will be used and visualized for further analyses.
Blog
Network Visualization Using Cytoscape
Cytoscape is a nice tool for visualizing networks for better understanding and presentation. I used it for the in silico data network visualization and the result was really pretty. Now, I have networks constructed using experimental data from the HPN-DREAM Challenge.
In this post, I want to demonstrate how to visualize a network with scores. I’m using Cytoscape 2.8 on Ubuntu 12.
First, the network will be read from a SIF file, which is Cytoscape’s default format for networks.
Blog
In silico Network Inference Last Improvements and Visualization of Result in Cytoscape
I’m almost done with the analysis of the in silico data, although I need to decide whether I need further analysis of the inhibiting parent nodes in the network. Previously, I couldn’t filter out duplicate edges that were scored differently. Now, with some improvements in the script, low-scoring duplicates are filtered out and there is a better final list of edges, ready to be visualized.
I also tried visualizing it on Cytoscape.
Tag: in degree
Blog
Salmonella - Host Interaction Network - A Detailed, Better Visualization
We’re almost done with the analyses and we’re making the final visualization of the network. As I previously posted, the network was clustered and visualized by time points. After that, we have done several more analyses and here I report how we visualized them. I’m going to post more about how we did the analyses separately.
First, the nodes are grouped into experimental and non-experimental (PCSF) nodes. This can easily be done by parsing the experimental network output and the network outputs of PCSF.
Tag: network
Blog
Salmonella - Host Interaction Network - A Detailed, Better Visualization
We’re almost done with the analyses and we’re making the final visualization of the network. As I previously posted, the network was clustered and visualized by time points. After that, we have done several more analyses and here I report how we visualized them. I’m going to post more about how we did the analyses separately.
First, the nodes are grouped into experimental and non-experimental (PCSF) nodes. This can easily be done by parsing the experimental network output and the network outputs of PCSF.
Blog
GO Enrichment of Network Clusters
In my previous post, I mentioned how I clustered the network we obtained at the end. For functional annotation, gene ontology (GO) enrichment has been done on these clusters.
There were 20 clusters; the HGNC names were obtained separately for each cluster, and using the DAVID functional annotation tool API, GO and pathway annotations were collected per cluster and saved separately.
http://david.abcc.ncifcrf.gov/api.jsp?type=OFFICIAL_GENE_SYMBOL&tool=chartReport&annot=GOTERM_BP_FAT,GOTERM_CC_FAT,GOTERM_MF_FAT,BBID,BIOCARTA,KEGG_PATHWAY&ids=HGNC_NAME1,HGNC_NAME2,HGNC_NAME3,...
The above URL was used to obtain the chart report for the GO and pathway chart records.
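That request URL can be assembled programmatically before fetching the chart report. A minimal sketch; the gene symbols below are hypothetical placeholders, not from the actual clusters:

```python
from urllib.parse import urlencode

# Hypothetical example symbols -- substitute the HGNC names of one cluster.
genes = ["TP53", "EGFR", "AKT1"]

base = "http://david.abcc.ncifcrf.gov/api.jsp"
params = {
    "type": "OFFICIAL_GENE_SYMBOL",
    "tool": "chartReport",
    "annot": "GOTERM_BP_FAT,GOTERM_CC_FAT,GOTERM_MF_FAT,"
             "BBID,BIOCARTA,KEGG_PATHWAY",
    "ids": ",".join(genes),
}
# safe="," keeps the comma-separated lists unescaped, matching the URL above.
url = base + "?" + urlencode(params, safe=",")
```

The resulting URL can then be fetched once per cluster and the returned chart report saved to a separate file.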
Blog
Network Clustering with NeAT - RNSC Algorithm
We obtained proteins at different time points from the experimental data, then found intermediate nodes (from the human interactome) using the PCSF algorithm, and finally, with a special matrix derived from the network that PCSF created, we validated the edges and determined edge directions using a divide-and-conquer ILP (integer linear programming) approach for the construction of large-scale signaling networks from PPI data. The resulting network is directed and will be used and visualized for further analyses.
Blog
Reconstructed Salmonella Signaling Network Visualized and Colored
After fold changes were obtained and HGNC names were found for each phosphopeptide, these were used to construct the Salmonella signaling network using PCSF. Then, together with the nodes that PCSF found, we generated a matrix with nodes in the rows and time points in the columns, where each cell shows the presence of the corresponding protein at the corresponding time point(s).
The matrix has 658 nodes (proteins) and 4 time points as indicated before: 2 min, 5 min, 10 min and 20 min.
Blog
Multi-dimensional Modeling and Reconstruction of Signaling Networks in Salmonella-infected Human Cells
In this study, we’re going to use phosphorylation data from a research paper on the phosphoproteomic analysis of related cells.
The idea is to use and compare existing methods and develop these methods to be able to better understand the nature of signaling events in these cells and to find key proteins that might be targets for disease diagnosis, prevention and treatment.
This study will be submitted as a research paper so I’m not going to publish any results here for now but I’ll mention the struggles I have and solutions I try to solve them.
Blog
Last Submissions to the Challenge
Today, I submitted in silico and experimental data network inference results on Synapse for the next leaderboard on this Wednesday.
For experimental part, I had to exclude edges with FGFR1 and FGFR3 because the data lacks phosphorylated forms of these proteins and networks must be constructed using only phosphoproteins in the data.
Since there was an update for in silico part, I had to modify the script and resubmit the results.
Blog
Network Visualization Using Cytoscape
Cytoscape is a nice tool for visualizing networks for better understanding and presentation. I used it for the in silico data network visualization and the result was really pretty. Now, I have networks constructed using experimental data from the HPN-DREAM Challenge.
In this post, I want to demonstrate how to visualize a network with scores. I’m using Cytoscape 2.8 on Ubuntu 12.
First, the network will be read from a SIF file, which is Cytoscape’s default format for networks.
Blog
Plotting Expression Curves for Experimental Data
Now that I can plot expression curves for in silico data, I moved on to the experimental data, which is larger and more complex. This data is the result of RPPA experiments on different breast cancer cell lines, and it includes protein abundance measurements for about 45 phosphoproteins. These phosphoproteins are treated with different inhibitors and stimuli, and by comparing their expression, I will try to infer relations between them.
Before moving on to the inference part, I want to have a script that can plot the graphs so that I can see particular results for specific cases.
Blog
Experimental Data Optimization for Network Inference
As I mentioned in my previous post, the experimental data from the challenge has missing values that create problems during analyses. To solve this, the first thing I did was to optimize the data, which included detecting missing conditions, putting NAs in for the missing values and sorting them where necessary.
I wrote two functions in the script. First one ranks the data according to the fashion and sorts it based on these ranks.
Blog
Working with Experimental Data from Network Inference Challenge
Having almost finished with the in silico data, I moved on to analyses of the experimental data using the same script. But since the characteristics of this data are somewhat different, I need to modify the script to be able to read the experimental data files before inferring networks.
These differences include missing data values for some conditions. This makes the analyses difficult because I have to estimate values for them, and this will decrease the confidence scores of the edges.
Blog
In silico Network Inference Last Improvements and Visualization of Result in Cytoscape
I’m almost done with the analysis of the in silico data, although I need to decide whether I need further analysis of the inhibiting parent nodes in the network. Previously, I couldn’t filter out duplicate edges that were scored differently. Now, with some improvements in the script, low-scoring duplicates are filtered out and there is a better final list of edges, ready to be visualized.
I also tried visualizing it on Cytoscape.
Tag: network visualization
Blog
Salmonella - Host Interaction Network - A Detailed, Better Visualization
We’re almost done with the analyses and we’re making the final visualization of the network. As I previously posted, the network was clustered and visualized by time points. After that, we have done several more analyses and here I report how we visualized them. I’m going to post more about how we did the analyses separately.
First, the nodes are grouped into experimental and non-experimental (PCSF) nodes. This can easily be done by parsing the experimental network output and the network outputs of PCSF.
Tag: networkx
Blog
Salmonella - Host Interaction Network - A Detailed, Better Visualization
We’re almost done with the analyses and we’re making the final visualization of the network. As I previously posted, the network was clustered and visualized by time points. After that, we have done several more analyses and here I report how we visualized them. I’m going to post more about how we did the analyses separately.
First, the nodes are grouped into experimental and non-experimental (PCSF) nodes. This can easily be done by parsing the experimental network output and the network outputs of PCSF.
Blog
Finding k-cores and Clustering Coefficient Computation with NetworkX
Assume you have a large network, and you want to find the k-core of each node and also compute the clustering coefficient of each one. The Python package NetworkX comes with very nice methods to do both easily.
A k-core is a maximal subgraph whose nodes all have degree at least k [1]. To find k-cores:
Add all the edges in your network to a NetworkX graph, and use the core_number method, which takes the graph as its single input and returns node – k-core pairs.
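As a minimal sketch, both computations are one call each in NetworkX (the toy graph here is my own example: a triangle with a pendant node):

```python
import networkx as nx

# Toy network: a triangle (1-2-3) with a pendant node 4 attached to node 3.
G = nx.Graph()
G.add_edges_from([(1, 2), (2, 3), (1, 3), (3, 4)])

# core_number takes the graph as its single input and returns {node: k-core}.
cores = nx.core_number(G)   # {1: 2, 2: 2, 3: 2, 4: 1}

# clustering returns each node's clustering coefficient: node 3 has three
# neighbours but only one edge among them, so its coefficient is 1/3.
coeffs = nx.clustering(G)
```

On a real network you would read the edge list from a file and pass it to add_edges_from in the same way.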
Tag: out degree
Blog
Salmonella - Host Interaction Network - A Detailed, Better Visualization
We’re almost done with the analyses and we’re making the final visualization of the network. As I previously posted, the network was clustered and visualized by time points. After that, we have done several more analyses and here I report how we visualized them. I’m going to post more about how we did the analyses separately.
First, the nodes are grouped into experimental and non-experimental (PCSF) nodes. This can easily be done by parsing the experimental network output and the network outputs of PCSF.
Tag: path damaging
Blog
Salmonella - Host Interaction Network - A Detailed, Better Visualization
We’re almost done with the analyses and we’re making the final visualization of the network. As I previously posted, the network was clustered and visualized by time points. After that, we have done several more analyses and here I report how we visualized them. I’m going to post more about how we did the analyses separately.
First, the nodes are grouped into experimental and non-experimental (PCSF) nodes. This can easily be done by parsing the experimental network output and the network outputs of PCSF.
Tag: pcsf
Blog
Salmonella - Host Interaction Network - A Detailed, Better Visualization
We’re almost done with the analyses and we’re making the final visualization of the network. As I previously posted, the network was clustered and visualized by time points. After that, we have done several more analyses and here I report how we visualized them. I’m going to post more about how we did the analyses separately.
First, the nodes are grouped into experimental and non-experimental (PCSF) nodes. This can easily be done by parsing the experimental network output and the network outputs of PCSF.
Blog
Network Clustering with NeAT - RNSC Algorithm
We obtained proteins at different time points from the experimental data, then found intermediate nodes (from the human interactome) using the PCSF algorithm, and finally, with a special matrix derived from the network that PCSF created, we validated the edges and determined edge directions using a divide-and-conquer ILP (integer linear programming) approach for the construction of large-scale signaling networks from PPI data. The resulting network is directed and will be used and visualized for further analyses.
Blog
Reconstructed Salmonella Signaling Network Visualized and Colored
After fold changes were obtained and HGNC names were found for each phosphopeptide, these were used to construct the Salmonella signaling network using PCSF. Then, together with the nodes that PCSF found, we generated a matrix with nodes in the rows and time points in the columns, where each cell shows the presence of the corresponding protein at the corresponding time point(s).
The matrix has 658 nodes (proteins) and 4 time points as indicated before: 2 min, 5 min, 10 min and 20 min.
Tag: visualization
Blog
Salmonella - Host Interaction Network - A Detailed, Better Visualization
We’re almost done with the analyses and we’re making the final visualization of the network. As I previously posted, the network was clustered and visualized by time points. After that, we have done several more analyses and here I report how we visualized them. I’m going to post more about how we did the analyses separately.
First, the nodes are grouped into experimental and non-experimental (PCSF) nodes. This can easily be done by parsing the experimental network output and the network outputs of PCSF.
Blog
Network Clustering with NeAT - RNSC Algorithm
We obtained proteins at different time points from the experimental data, then found intermediate nodes (from the human interactome) using the PCSF algorithm, and finally, with a special matrix derived from the network that PCSF created, we validated the edges and determined edge directions using a divide-and-conquer ILP (integer linear programming) approach for the construction of large-scale signaling networks from PPI data. The resulting network is directed and will be used and visualized for further analyses.
Blog
In silico Network Inference Last Improvements and Visualization of Result in Cytoscape
I’m almost done with the analysis of the in silico data, although I need to decide whether I need further analysis of the inhibiting parent nodes in the network. Previously, I couldn’t filter out duplicate edges that were scored differently. Now, with some improvements in the script, low-scoring duplicates are filtered out and there is a better final list of edges, ready to be visualized.
I also tried visualizing it on Cytoscape.
Blog
DREAM Breast Cancer Sub-challenges
I have been going over the sub-challenges before attempting to solve them. As I mentioned, there are three sub-challenges, and they are connected to each other.
First, using the given data and other possible data sources such as pathway databases, the causal signaling network of the phosphoproteins will be inferred. There are 4 cell lines and 8 stimuli, so they make 32 networks in total. Nodes are phosphoproteins, and edges should be directed and causal (activator or inhibitor).
Tag: chart reports
Blog
GO Enrichment of Network Clusters
In my previous post, I mentioned how I clustered the network we obtained at the end. For functional annotation, gene ontology (GO) enrichment has been done on these clusters.
There were 20 clusters; the HGNC names were obtained separately for each cluster, and using the DAVID functional annotation tool API, GO and pathway annotations were collected per cluster and saved separately.
http://david.abcc.ncifcrf.gov/api.jsp?type=OFFICIAL_GENE_SYMBOL&tool=chartReport&annot=GOTERM_BP_FAT,GOTERM_CC_FAT,GOTERM_MF_FAT,BBID,BIOCARTA,KEGG_PATHWAY&ids=HGNC_NAME1,HGNC_NAME2,HGNC_NAME3,...
The above URL was used to obtain the chart report for the GO and pathway chart records.
Tag: clustering
Blog
GO Enrichment of Network Clusters
In my previous post, I mentioned how I clustered the network we obtained at the end. For functional annotation, gene ontology (GO) enrichment has been done on these clusters.
There were 20 clusters; the HGNC names were obtained separately for each cluster, and using the DAVID functional annotation tool API, GO and pathway annotations were collected per cluster and saved separately.
http://david.abcc.ncifcrf.gov/api.jsp?type=OFFICIAL_GENE_SYMBOL&tool=chartReport&annot=GOTERM_BP_FAT,GOTERM_CC_FAT,GOTERM_MF_FAT,BBID,BIOCARTA,KEGG_PATHWAY&ids=HGNC_NAME1,HGNC_NAME2,HGNC_NAME3,...
The above URL was used to obtain the chart report for the GO and pathway chart records.
Blog
Network Clustering with NeAT - RNSC Algorithm
We obtained proteins at different time points from the experimental data, then found intermediate nodes (from the human interactome) using the PCSF algorithm, and finally, with a special matrix derived from the network that PCSF created, we validated the edges and determined edge directions using a divide-and-conquer ILP (integer linear programming) approach for the construction of large-scale signaling networks from PPI data. The resulting network is directed and will be used and visualized for further analyses.
Blog
Finding k-cores and Clustering Coefficient Computation with NetworkX
Assume you have a large network, and you want to find the k-core of each node and also compute the clustering coefficient of each one. The Python package NetworkX comes with very nice methods to do both easily.
A k-core is a maximal subgraph whose nodes all have degree at least k [1]. To find k-cores:
Add all the edges in your network to a NetworkX graph, and use the core_number method, which takes the graph as its single input and returns node – k-core pairs.
Blog
UPGMA Algorithm Described - Unweighted Pair-Group Method with Arithmetic Mean
UPGMA is an agglomerative clustering algorithm, developed by Sokal and Michener in 1958, that is ultrametric (it assumes a molecular clock, i.e. that all lineages evolve at a constant rate).
The idea is to iterate until only one cluster remains: at each iteration, the two nearest clusters are joined (becoming a higher-level cluster). The distance between any two clusters is calculated by averaging the distances between the elements of each cluster.
To understand better, see UPGMA worked example by Dr Richard Edwards.
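The iteration above can be sketched in a few lines of Python. This is a minimal illustration, not an efficient implementation; the input format and cluster labels are of my own choosing:

```python
def upgma(dist, sizes):
    """dist: {frozenset({a, b}): distance}; sizes: {cluster: element count}.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples."""
    d, sizes = dict(dist), dict(sizes)
    merges = []
    while len(sizes) > 1:
        pair = min(d, key=d.get)            # join the two nearest clusters
        a, b = sorted(pair)
        na, nb = sizes.pop(a), sizes.pop(b)
        merges.append((a, b, d.pop(pair)))
        new = f"({a},{b})"
        for c in sizes:
            # Distance to the merged cluster is the size-weighted mean of the
            # old distances, i.e. the average over all element pairs.
            dac = d.pop(frozenset({a, c}))
            dbc = d.pop(frozenset({b, c}))
            d[frozenset({new, c})] = (na * dac + nb * dbc) / (na + nb)
        sizes[new] = na + nb
    return merges

merges = upgma(
    {frozenset({"A", "B"}): 2, frozenset({"A", "C"}): 4, frozenset({"B", "C"}): 4},
    {"A": 1, "B": 1, "C": 1},
)
# merges == [("A", "B", 2), ("(A,B)", "C", 4.0)]
```

A and B (distance 2) merge first; the distance from (A,B) to C is then the plain average (4 + 4) / 2 = 4, as UPGMA prescribes.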
Tag: david
Blog
GO Enrichment of Network Clusters
In my previous post, I mentioned how I clustered the network we obtained at the end. For functional annotation, gene ontology (GO) enrichment has been done on these clusters.
There were 20 clusters; the HGNC names were obtained separately for each cluster, and using the DAVID functional annotation tool API, GO and pathway annotations were collected per cluster and saved separately.
http://david.abcc.ncifcrf.gov/api.jsp?type=OFFICIAL_GENE_SYMBOL&tool=chartReport&annot=GOTERM_BP_FAT,GOTERM_CC_FAT,GOTERM_MF_FAT,BBID,BIOCARTA,KEGG_PATHWAY&ids=HGNC_NAME1,HGNC_NAME2,HGNC_NAME3,...
The above URL was used to obtain the chart report for the GO and pathway chart records.
Blog
Salmonella Data Preprocessing for PCSF Algorithm
This post describes data preprocessing in Salmonella project for Prize-Collecting Steiner Forest Problem (PCSF) algorithm.
The Salmonella data, taken from Table S6 in Phosphoproteomic Analysis of Salmonella-Infected Cells Identifies Key Kinase Regulators and SopB-Dependent Host Phosphorylation Events by Rogers, LD et al., has been converted from its original XLS file to a tab-delimited TXT file for easy reading in Python.
The data should be separated into time-point files (2, 5, 10 and 20 minutes), each of which will contain the corresponding phosphoproteins and their fold changes.
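A minimal sketch of that split. The column layout is an assumption (the actual table format isn't shown here): one protein column followed by one fold-change column per time point, with blanks or "NA" for missing measurements:

```python
TIME_POINTS = ["2", "5", "10", "20"]  # minutes

def split_by_time_point(rows):
    """rows: iterable of [protein, fc_2min, fc_5min, fc_10min, fc_20min].
    Returns {time_point: [(protein, fold_change), ...]}."""
    out = {tp: [] for tp in TIME_POINTS}
    for protein, *fold_changes in rows:
        for tp, fc in zip(TIME_POINTS, fold_changes):
            if fc not in ("", "NA"):          # skip missing measurements
                out[tp].append((protein, float(fc)))
    return out

# Hypothetical example row: present at 2 and 10 minutes only.
by_tp = split_by_time_point([["AKT1", "1.5", "", "2.0", "NA"]])

# Reading the converted TXT file would look like (filename hypothetical):
# with open("table_s6.txt") as fh:
#     by_tp = split_by_time_point(line.rstrip("\n").split("\t") for line in fh)
```

Each list in the result can then be written out as one file per time point.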
Blog
Data Preprocessing I for Salmon Project
Since we’ll be using R for most of the analyses, we converted the XLS data file to CSV using MS Office Excel 2013, and then we had to fix several lines using Sublime Text 2 because three columns in these lines were left unquoted, which later created a problem when reading the file in RStudio.
The data contains phosphorylation measurements for 8553 peptides. There are many missing data points for many peptides, and since IPI IDs, which are no longer supported, were used for the peptides, we had to convert the IPI IDs to HGNC-approved symbols; the data did have these symbols as names, but they looked outdated.
Tag: functional annotation
Blog
GO Enrichment of Network Clusters
In my previous post, I mentioned how I clustered the network we obtained at the end. For functional annotation, gene ontology (GO) enrichment has been done on these clusters.
There were 20 clusters; the HGNC names were obtained separately for each cluster, and using the DAVID functional annotation tool API, GO and pathway annotations were collected per cluster and saved separately.
http://david.abcc.ncifcrf.gov/api.jsp?type=OFFICIAL_GENE_SYMBOL&tool=chartReport&annot=GOTERM_BP_FAT,GOTERM_CC_FAT,GOTERM_MF_FAT,BBID,BIOCARTA,KEGG_PATHWAY&ids=HGNC_NAME1,HGNC_NAME2,HGNC_NAME3,...
The above URL was used to obtain the chart report for the GO and pathway chart records.
Tag: gene ontology
Blog
GO Enrichment of Network Clusters
In my previous post, I mentioned how I clustered the network we obtained at the end. For functional annotation, gene ontology (GO) enrichment has been done on these clusters.
There were 20 clusters; the HGNC names were obtained separately for each cluster, and using the DAVID functional annotation tool API, GO and pathway annotations were collected per cluster and saved separately.
http://david.abcc.ncifcrf.gov/api.jsp?type=OFFICIAL_GENE_SYMBOL&tool=chartReport&annot=GOTERM_BP_FAT,GOTERM_CC_FAT,GOTERM_MF_FAT,BBID,BIOCARTA,KEGG_PATHWAY&ids=HGNC_NAME1,HGNC_NAME2,HGNC_NAME3,...
The above URL was used to obtain the chart report for the GO and pathway chart records.
Tag: go
Blog
GO Enrichment of Network Clusters
In my previous post, I mentioned how I clustered the network we obtained at the end. For functional annotation, gene ontology (GO) enrichment was performed on these clusters.
There were 20 clusters. The HGNC names were obtained separately for each cluster, and using the DAVID functional annotation tool API, GO and pathway annotations were collected per cluster and saved separately.
The following URL was used to obtain a chart report of GO and pathway chart records: http://david.abcc.ncifcrf.gov/api.jsp?type=OFFICIAL_GENE_SYMBOL&tool=chartReport&annot=GOTERM_BP_FAT,GOTERM_CC_FAT,GOTERM_MF_FAT,BBID,BIOCARTA,KEGG_PATHWAY&ids=HGNC_NAME1,HGNC_NAME2,HGNC_NAME3,...
Tag: hgnc
Blog
GO Enrichment of Network Clusters
In my previous post, I mentioned how I clustered the network we obtained at the end. For functional annotation, gene ontology (GO) enrichment was performed on these clusters.
There were 20 clusters. The HGNC names were obtained separately for each cluster, and using the DAVID functional annotation tool API, GO and pathway annotations were collected per cluster and saved separately.
The following URL was used to obtain a chart report of GO and pathway chart records: http://david.abcc.ncifcrf.gov/api.jsp?type=OFFICIAL_GENE_SYMBOL&tool=chartReport&annot=GOTERM_BP_FAT,GOTERM_CC_FAT,GOTERM_MF_FAT,BBID,BIOCARTA,KEGG_PATHWAY&ids=HGNC_NAME1,HGNC_NAME2,HGNC_NAME3,...
Blog
Reconstructed Salmonella Signaling Network Visualized and Colored
After the fold changes were obtained and an HGNC name was found for each phosphopeptide, they were used to reconstruct the Salmonella signaling network with PCSF. Then, including the nodes that PCSF added, we generated a matrix with nodes in the rows and time points in the columns, where each cell shows the presence of the corresponding protein at the corresponding time point(s).
The matrix has 658 nodes (proteins) and 4 time points, as indicated before: 2 min, 5 min, 10 min and 20 min.
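The presence matrix can be sketched like this (the protein names and observations below are illustrative, not values from the data):

```python
# Rows are nodes, columns are the four time points; a cell is 1 when the
# protein was observed at that time point, 0 otherwise.
time_points = ["2min", "5min", "10min", "20min"]
observed = {"AKT1": {"2min", "10min"}, "MAPK1": {"5min"}}  # example data

matrix = {node: [1 if tp in tps else 0 for tp in time_points]
          for node, tps in observed.items()}
```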
Blog
Salmonella Data Preprocessing for PCSF Algorithm
This post describes the data preprocessing in the Salmonella project for the Prize-Collecting Steiner Forest (PCSF) algorithm.
The Salmonella data, taken from Table S6 of "Phosphoproteomic Analysis of Salmonella-Infected Cells Identifies Key Kinase Regulators and SopB-Dependent Host Phosphorylation Events" by Rogers, LD et al., was converted from its original XLS file into a tab-delimited TXT file for easy reading in Python.
The data should be separated into time-point files (2, 5, 10 and 20 minutes), each of which will contain the corresponding phosphoproteins and their fold changes.
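The separation step can be sketched as follows (the column layout assumed here is illustrative; the real table's columns may differ):

```python
# Split rows into one list of (protein, fold change) per time point,
# skipping proteins that were not measured at that time point.
TIME_POINTS = ["2", "5", "10", "20"]  # minutes

def split_by_time_point(rows):
    """rows: dicts with a 'protein' key and one fold-change column
    per time point; returns {time_point: [(protein, fold_change), ...]}."""
    per_tp = {tp: [] for tp in TIME_POINTS}
    for row in rows:
        for tp in TIME_POINTS:
            value = row.get(tp)
            if value not in (None, ""):
                per_tp[tp].append((row["protein"], float(value)))
    return per_tp
```

Each `per_tp[tp]` list can then be written to its own tab-delimited file.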
Blog
Data Preprocessing I for Salmon Project
Since we’ll be using R for most of the analyses, we converted the XLS data file to CSV using MS Office Excel 2013 and then had to fix several lines with Sublime Text 2, because three columns in those lines were left unquoted, which later caused a problem when reading the file in RStudio.
The data contains phosphorylation measurements for 8,553 peptides, with many missing data points. Since the peptides were identified by IPI IDs, which are no longer supported, we had to convert the IPI IDs to HGNC-approved symbols; the data did include these symbols as names, but they looked outdated.
Tag: network clustering
Blog
GO Enrichment of Network Clusters
In my previous post, I mentioned how I clustered the network we obtained at the end. For functional annotation, gene ontology (GO) enrichment was performed on these clusters.
There were 20 clusters. The HGNC names were obtained separately for each cluster, and using the DAVID functional annotation tool API, GO and pathway annotations were collected per cluster and saved separately.
The following URL was used to obtain a chart report of GO and pathway chart records: http://david.abcc.ncifcrf.gov/api.jsp?type=OFFICIAL_GENE_SYMBOL&tool=chartReport&annot=GOTERM_BP_FAT,GOTERM_CC_FAT,GOTERM_MF_FAT,BBID,BIOCARTA,KEGG_PATHWAY&ids=HGNC_NAME1,HGNC_NAME2,HGNC_NAME3,...
Blog
Network Clustering with NeAT - RNSC Algorithm
We obtained proteins at different time points from the experimental data, then found intermediate nodes (from the human interactome) using the PCSF algorithm, and finally, with a special matrix derived from the network PCSF created, we validated the edges and determined edge directions using a divide-and-conquer (ILP) approach for constructing large-scale signaling networks from PPI data. The resulting network is directed and will be used and visualized for further analyses.
Tag: ilp
Blog
Network Clustering with NeAT - RNSC Algorithm
We obtained proteins at different time points from the experimental data, then found intermediate nodes (from the human interactome) using the PCSF algorithm, and finally, with a special matrix derived from the network PCSF created, we validated the edges and determined edge directions using a divide-and-conquer (ILP) approach for constructing large-scale signaling networks from PPI data. The resulting network is directed and will be used and visualized for further analyses.
Tag: neat
Blog
Network Clustering with NeAT - RNSC Algorithm
We obtained proteins at different time points from the experimental data, then found intermediate nodes (from the human interactome) using the PCSF algorithm, and finally, with a special matrix derived from the network PCSF created, we validated the edges and determined edge directions using a divide-and-conquer (ILP) approach for constructing large-scale signaling networks from PPI data. The resulting network is directed and will be used and visualized for further analyses.
Tag: rnsc
Blog
Network Clustering with NeAT - RNSC Algorithm
We obtained proteins at different time points from the experimental data, then found intermediate nodes (from the human interactome) using the PCSF algorithm, and finally, with a special matrix derived from the network PCSF created, we validated the edges and determined edge directions using a divide-and-conquer (ILP) approach for constructing large-scale signaling networks from PPI data. The resulting network is directed and will be used and visualized for further analyses.
Tag: clustering coefficient
Blog
Finding k-cores and Clustering Coefficient Computation with NetworkX
Assume you have a large network and you want to find the k-core of each node and also compute the clustering coefficient of each one. The Python package NetworkX comes with very convenient methods that let you do both easily.
A k-core is a maximal subgraph whose nodes all have degree at least k [1]. To find k-cores:
Add all the edges in your network to a NetworkX graph, and use the core_number method, which takes the graph as its single input and returns node – k-core pairs.
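The two calls can be sketched on a toy graph like this (the node names are illustrative):

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")])

cores = nx.core_number(G)   # node -> k-core number
coeffs = nx.clustering(G)   # node -> clustering coefficient
# "a", "b", "c" form a triangle, so they sit in the 2-core;
# "d" hangs off "c" with degree 1, so it only reaches the 1-core.
```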
Tag: core_number
Blog
Finding k-cores and Clustering Coefficient Computation with NetworkX
Assume you have a large network and you want to find the k-core of each node and also compute the clustering coefficient of each one. The Python package NetworkX comes with very convenient methods that let you do both easily.
A k-core is a maximal subgraph whose nodes all have degree at least k [1]. To find k-cores:
Add all the edges in your network to a NetworkX graph, and use the core_number method, which takes the graph as its single input and returns node – k-core pairs.
Tag: cores
Blog
Finding k-cores and Clustering Coefficient Computation with NetworkX
Assume you have a large network and you want to find the k-core of each node and also compute the clustering coefficient of each one. The Python package NetworkX comes with very convenient methods that let you do both easily.
A k-core is a maximal subgraph whose nodes all have degree at least k [1]. To find k-cores:
Add all the edges in your network to a NetworkX graph, and use the core_number method, which takes the graph as its single input and returns node – k-core pairs.
Tag: k-core
Blog
Finding k-cores and Clustering Coefficient Computation with NetworkX
Assume you have a large network and you want to find the k-core of each node and also compute the clustering coefficient of each one. The Python package NetworkX comes with very convenient methods that let you do both easily.
A k-core is a maximal subgraph whose nodes all have degree at least k [1]. To find k-cores:
Add all the edges in your network to a NetworkX graph, and use the core_number method, which takes the graph as its single input and returns node – k-core pairs.
Tag: codon
Blog
Searching Open Reading Frames (ORF) in DNA sequences - ORF Finder
Open reading frames (ORFs) are regions of DNA that are translated into protein. They lie between start and stop codons and are usually long.
The Python script below searches for ORFs in all six frames and returns the longest one. It doesn’t treat the start codon as a delimiter and only splits the sequence by stop codons, so an ORF can start with any codon but must end with a stop codon (TAG, TGA, TAA).
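A minimal sketch of this stop-codon-splitting search (not the exact script from the post) could look like:

```python
# Scan all six reading frames, split each frame by stop codons only,
# and keep the longest stretch that ends at a stop codon.
STOP_CODONS = {"TAG", "TGA", "TAA"}

def revcomp(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def longest_orf(seq):
    seq = seq.upper()
    best = ""
    for strand in (seq, revcomp(seq)):        # forward and reverse strands
        for frame in range(3):                # three frames per strand
            current = []
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if codon in STOP_CODONS:      # ORF must end at a stop codon
                    candidate = "".join(current)
                    if len(candidate) > len(best):
                        best = candidate
                    current = []
                else:
                    current.append(codon)
    return best
```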
Tag: dna
Blog
Searching Open Reading Frames (ORF) in DNA sequences - ORF Finder
Open reading frames (ORFs) are regions of DNA that are translated into protein. They lie between start and stop codons and are usually long.
The Python script below searches for ORFs in all six frames and returns the longest one. It doesn’t treat the start codon as a delimiter and only splits the sequence by stop codons, so an ORF can start with any codon but must end with a stop codon (TAG, TGA, TAA).
Blog
Detecting Organisms that Contaminate Sequencing Studies
The first project of my summer internship is slowly taking shape. In it, I’ll build a pipeline and try to use it to find the organisms that contaminate sequencing samples in laboratories.
In laboratories, samples can be contaminated by other organisms or foreign DNA for many reasons; the contaminant may be a bacterium, a yeast, or even viral DNA. After you sequence a DNA sample, the fraction of reads mapping to its reference can turn out very low, which suggests that foreign DNA may be present. Another possible reason is that the reference DNA is simply different.
Tag: open reading frames
Blog
Searching Open Reading Frames (ORF) in DNA sequences - ORF Finder
Open reading frames (ORFs) are regions of DNA that are translated into protein. They lie between start and stop codons and are usually long.
The Python script below searches for ORFs in all six frames and returns the longest one. It doesn’t treat the start codon as a delimiter and only splits the sequence by stop codons, so an ORF can start with any codon but must end with a stop codon (TAG, TGA, TAA).
Tag: orf
Blog
Searching Open Reading Frames (ORF) in DNA sequences - ORF Finder
Open reading frames (ORFs) are regions of DNA that are translated into protein. They lie between start and stop codons and are usually long.
The Python script below searches for ORFs in all six frames and returns the longest one. It doesn’t treat the start codon as a delimiter and only splits the sequence by stop codons, so an ORF can start with any codon but must end with a stop codon (TAG, TGA, TAA).
Tag: sequence
Blog
Searching Open Reading Frames (ORF) in DNA sequences - ORF Finder
Open reading frames (ORFs) are regions of DNA that are translated into protein. They lie between start and stop codons and are usually long.
The Python script below searches for ORFs in all six frames and returns the longest one. It doesn’t treat the start codon as a delimiter and only splits the sequence by stop codons, so an ORF can start with any codon but must end with a stop codon (TAG, TGA, TAA).
Tag: taa
Blog
Searching Open Reading Frames (ORF) in DNA sequences - ORF Finder
Open reading frames (ORFs) are regions of DNA that are translated into protein. They lie between start and stop codons and are usually long.
The Python script below searches for ORFs in all six frames and returns the longest one. It doesn’t treat the start codon as a delimiter and only splits the sequence by stop codons, so an ORF can start with any codon but must end with a stop codon (TAG, TGA, TAA).
Tag: tag
Blog
Searching Open Reading Frames (ORF) in DNA sequences - ORF Finder
Open reading frames (ORFs) are regions of DNA that are translated into protein. They lie between start and stop codons and are usually long.
The Python script below searches for ORFs in all six frames and returns the longest one. It doesn’t treat the start codon as a delimiter and only splits the sequence by stop codons, so an ORF can start with any codon but must end with a stop codon (TAG, TGA, TAA).
Tag: tga
Blog
Searching Open Reading Frames (ORF) in DNA sequences - ORF Finder
Open reading frames (ORFs) are regions of DNA that are translated into protein. They lie between start and stop codons and are usually long.
The Python script below searches for ORFs in all six frames and returns the longest one. It doesn’t treat the start codon as a delimiter and only splits the sequence by stop codons, so an ORF can start with any codon but must end with a stop codon (TAG, TGA, TAA).
Tag: frontiers
Blog
Reconstructed Salmonella Signaling Network Visualized and Colored
After the fold changes were obtained and an HGNC name was found for each phosphopeptide, they were used to reconstruct the Salmonella signaling network with PCSF. Then, including the nodes that PCSF added, we generated a matrix with nodes in the rows and time points in the columns, where each cell shows the presence of the corresponding protein at the corresponding time point(s).
The matrix has 658 nodes (proteins) and 4 time points, as indicated before: 2 min, 5 min, 10 min and 20 min.
Blog
Salmonella Data Preprocessing for PCSF Algorithm
This post describes data preprocessing in Salmonella project for Prize-Collecting Steiner Forest Problem (PCSF) algorithm.
Salmonella data taken from Table S6 in Phosphoproteomic Analysis of Salmonella-Infected Cells Identifies Key Kinase Regulators and SopB-Dependent Host Phosphorylation Events by Rogers, LD et al. has been converted to tab delimited TXT file from its original XLS file for easy reading in Python.
The data should be separated into time points files (2, 5, 10 and 20 minutes) each of which will contain corresponding phophoproteins and their fold changes.
Tag: metu
Blog
Reconstructed Salmonella Signaling Network Visualized and Colored
After the fold changes were obtained and an HGNC name was found for each phosphopeptide, they were used to reconstruct the Salmonella signaling network with PCSF. Then, including the nodes that PCSF added, we generated a matrix with nodes in the rows and time points in the columns, where each cell shows the presence of the corresponding protein at the corresponding time point(s).
The matrix has 658 nodes (proteins) and 4 time points, as indicated before: 2 min, 5 min, 10 min and 20 min.
Tag: network construction
Blog
Reconstructed Salmonella Signaling Network Visualized and Colored
After the fold changes were obtained and an HGNC name was found for each phosphopeptide, they were used to reconstruct the Salmonella signaling network with PCSF. Then, including the nodes that PCSF added, we generated a matrix with nodes in the rows and time points in the columns, where each cell shows the presence of the corresponding protein at the corresponding time point(s).
The matrix has 658 nodes (proteins) and 4 time points, as indicated before: 2 min, 5 min, 10 min and 20 min.
Tag: salmonella
Blog
Reconstructed Salmonella Signaling Network Visualized and Colored
After the fold changes were obtained and an HGNC name was found for each phosphopeptide, they were used to reconstruct the Salmonella signaling network with PCSF. Then, including the nodes that PCSF added, we generated a matrix with nodes in the rows and time points in the columns, where each cell shows the presence of the corresponding protein at the corresponding time point(s).
The matrix has 658 nodes (proteins) and 4 time points, as indicated before: 2 min, 5 min, 10 min and 20 min.
Blog
Salmonella Data Preprocessing for PCSF Algorithm
This post describes the data preprocessing in the Salmonella project for the Prize-Collecting Steiner Forest (PCSF) algorithm.
The Salmonella data, taken from Table S6 of "Phosphoproteomic Analysis of Salmonella-Infected Cells Identifies Key Kinase Regulators and SopB-Dependent Host Phosphorylation Events" by Rogers, LD et al., was converted from its original XLS file into a tab-delimited TXT file for easy reading in Python.
The data should be separated into time-point files (2, 5, 10 and 20 minutes), each of which will contain the corresponding phosphoproteins and their fold changes.
Blog
Data Preprocessing II for Salmon Project
In our Multi-dimensional Modeling and Reconstruction of Signaling Networks in Salmonella-infected Human Cells project, we have several methods for constructing the networks, so the data still needs to be preprocessed to be ready for analysis with each of them.
One method needed a matrix whose first row holds the protein name and the time series (2 min, 5 min, 10 min, 20 min); the value of each protein at each time point was set to 1 or 0 according to the variance, significance and size of its fold change.
Blog
Data Preprocessing I for Salmon Project
Since we’ll be using R for most of the analyses, we converted the XLS data file to CSV using MS Office Excel 2013 and then had to fix several lines with Sublime Text 2, because three columns in those lines were left unquoted, which later caused a problem when reading the file in RStudio.
The data contains phosphorylation measurements for 8,553 peptides, with many missing data points. Since the peptides were identified by IPI IDs, which are no longer supported, we had to convert the IPI IDs to HGNC-approved symbols; the data did include these symbols as names, but they looked outdated.
Blog
Multi-dimensional Modeling and Reconstruction of Signaling Networks in Salmonella-infected Human Cells
In this study, we’re going to use phosphorylation data from a research paper on the phosphoproteomic analysis of the related cells.
The idea is to use and compare existing methods, and to develop them further, in order to better understand the nature of signaling events in these cells and to find key proteins that might be targets for disease diagnosis, prevention and treatment.
This study will be submitted as a research paper, so I’m not going to publish any results here for now, but I’ll mention the struggles I run into and the solutions I try.
Tag: list
Blog
Python: Get Longest String in a List
Here is a quick Python trick you might use in your code.
Assume you have a list of strings and you want to get the longest one in the most efficient way.
>>> l = ["aaa", "bb", "c"]
>>> longest_string = max(l, key=len)
>>> longest_string
'aaa'
Tag: longest string
Blog
Python: Get Longest String in a List
Here is a quick Python trick you might use in your code.
Assume you have a list of strings and you want to get the longest one in the most efficient way.
>>> l = ["aaa", "bb", "c"]
>>> longest_string = max(l, key=len)
>>> longest_string
'aaa'
Tag: string
Blog
Python: Get Longest String in a List
Here is a quick Python trick you might use in your code.
Assume you have a list of strings and you want to get the longest one in the most efficient way.
>>> l = ["aaa", "bb", "c"]
>>> longest_string = max(l, key=len)
>>> longest_string
'aaa'
Blog
Performing Multiple Searches in SRS
Since the latest version of the analysis script examines more reads than the previous ones, searching SRS for a name for every single read was a very time-consuming operation; the last run took 4 days.
To reduce this, I completely rewrote the analysis script. As always, it first takes the reads that pass the threshold, but now I collect their ID numbers directly in an array. Then I join the elements of this list with the pipe character into a single string.
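The batching step can be sketched in Python (the original script's language isn't shown; the reads, scores, and threshold below are made up for illustration):

```python
# Collect the IDs of reads that pass the threshold, then join them with
# the pipe character so SRS can be queried once instead of once per read.
reads = [("read1", 0.92), ("read2", 0.40), ("read3", 0.85)]  # (id, score)
THRESHOLD = 0.8  # assumed cutoff

ids = [rid for rid, score in reads if score >= THRESHOLD]
query = "|".join(ids)
```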
Tag: defaultdict
Blog
Python: defaultdict(list) Dictionary of Lists
Most of the time, when you need to work with large data, you’ll have to use dictionaries in Python. Dictionaries of lists are very useful for storing large data in a very organized way. You can always initialize them by creating empty lists inside an empty dictionary, but when you don’t know how many you’ll end up with, or if you just want an easier option, use defaultdict(list). You just need to import it first:
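A quick sketch of the pattern (the key and value names are illustrative):

```python
from collections import defaultdict

genes_by_cluster = defaultdict(list)
genes_by_cluster["cluster1"].append("TP53")   # list created automatically
genes_by_cluster["cluster1"].append("EGFR")
genes_by_cluster["cluster2"].append("AKT1")
```

The first access to a missing key creates the empty list for you, so there is no need to check for the key or initialize anything.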
Tag: dictionary of lists
Blog
Python: defaultdict(list) Dictionary of Lists
Most of the time, when you need to work with large data, you’ll have to use dictionaries in Python. Dictionaries of lists are very useful for storing large data in a very organized way. You can always initialize them by creating empty lists inside an empty dictionary, but when you don’t know how many you’ll end up with, or if you just want an easier option, use defaultdict(list). You just need to import it first:
Tag: append
Blog
Python: extend() Append Elements of a List to a List
When you append a list to a list using the append() method, you’ll see that your list is appended as a single nested list:
>>> l = ["a"]
>>> l2 = ["a", "b"]
>>> l.append(l2)
>>> l
['a', ['a', 'b']]
If you want to append the elements of the list directly, without creating nested lists, use the extend() method:
>>> l = ["a"]
>>> l2 = ["a", "b"]
>>> l.extend(l2)
>>> l
['a', 'a', 'b']
Tag: append elements to a list
Blog
Python: extend() Append Elements of a List to a List
When you append a list to a list using the append() method, you’ll see that your list is appended as a single nested list:
>>> l = ["a"]
>>> l2 = ["a", "b"]
>>> l.append(l2)
>>> l
['a', ['a', 'b']]
If you want to append the elements of the list directly, without creating nested lists, use the extend() method:
>>> l = ["a"]
>>> l2 = ["a", "b"]
>>> l.extend(l2)
>>> l
['a', 'a', 'b']
Tag: extend
Blog
Python: extend() Append Elements of a List to a List
When you append a list to a list using the append() method, you’ll see that your list is appended as a single nested list:
>>> l = ["a"]
>>> l2 = ["a", "b"]
>>> l.append(l2)
>>> l
['a', ['a', 'b']]
If you want to append the elements of the list directly, without creating nested lists, use the extend() method:
>>> l = ["a"]
>>> l2 = ["a", "b"]
>>> l.extend(l2)
>>> l
['a', 'a', 'b']
Tag: list of elements
Blog
Python: extend() Append Elements of a List to a List
When you append a list to a list using the append() method, you’ll see that your list is appended as a single nested list:
>>> l = ["a"]
>>> l2 = ["a", "b"]
>>> l.append(l2)
>>> l
['a', ['a', 'b']]
If you want to append the elements of the list directly, without creating nested lists, use the extend() method:
>>> l = ["a"]
>>> l2 = ["a", "b"]
>>> l.extend(l2)
>>> l
['a', 'a', 'b']
Tag: python methods
Blog
Python: extend() Append Elements of a List to a List
When you append a list to a list using the append() method, you’ll see that your list is appended as a single nested list:
>>> l = ["a"]
>>> l2 = ["a", "b"]
>>> l.append(l2)
>>> l
['a', ['a', 'b']]
If you want to append the elements of the list directly, without creating nested lists, use the extend() method:
>>> l = ["a"]
>>> l2 = ["a", "b"]
>>> l.extend(l2)
>>> l
['a', 'a', 'b']
Tag: data preprocessing
Blog
Salmonella Data Preprocessing for PCSF Algorithm
This post describes the data preprocessing in the Salmonella project for the Prize-Collecting Steiner Forest (PCSF) algorithm.
The Salmonella data, taken from Table S6 of "Phosphoproteomic Analysis of Salmonella-Infected Cells Identifies Key Kinase Regulators and SopB-Dependent Host Phosphorylation Events" by Rogers, LD et al., was converted from its original XLS file into a tab-delimited TXT file for easy reading in Python.
The data should be separated into time-point files (2, 5, 10 and 20 minutes), each of which will contain the corresponding phosphoproteins and their fold changes.
Tag: txt
Blog
Salmonella Data Preprocessing for PCSF Algorithm
This post describes the data preprocessing in the Salmonella project for the Prize-Collecting Steiner Forest (PCSF) algorithm.
The Salmonella data, taken from Table S6 of "Phosphoproteomic Analysis of Salmonella-Infected Cells Identifies Key Kinase Regulators and SopB-Dependent Host Phosphorylation Events" by Rogers, LD et al., was converted from its original XLS file into a tab-delimited TXT file for easy reading in Python.
The data should be separated into time-point files (2, 5, 10 and 20 minutes), each of which will contain the corresponding phosphoproteins and their fold changes.
Tag: clustering algorithms
Blog
UPGMA Algorithm Described - Unweighted Pair-Group Method with Arithmetic Mean
UPGMA is an agglomerative clustering algorithm, introduced by Sokal and Michener in 1958, that is ultrametric (it assumes a molecular clock: all lineages evolve at a constant rate).
The idea is to iterate until only one cluster remains: at each iteration, the two nearest clusters are joined (becoming a higher-level cluster). The distance between any two clusters is calculated by averaging the distances between the elements of each cluster.
To understand better, see UPGMA worked example by Dr Richard Edwards.
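The iteration can be sketched compactly, using size-weighted averages so that merged distances stay arithmetic means over leaf pairs:

```python
def upgma(labels, dist):
    """labels: leaf names; dist: symmetric distance matrix (list of lists).
    Returns the merge order as nested tuples."""
    clusters = {i: (labels[i], 1) for i in range(len(labels))}  # id -> (tree, size)
    d = {(i, j): dist[i][j] for i in clusters for j in clusters if i < j}
    next_id = len(labels)
    while len(clusters) > 1:
        i, j = min(d, key=d.get)              # join the two nearest clusters
        (ti, ni), (tj, nj) = clusters[i], clusters[j]
        for k in clusters:
            if k in (i, j):
                continue
            # distance to the merged cluster = size-weighted mean of the parts
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            d[(min(next_id, k), max(next_id, k))] = (ni * dik + nj * djk) / (ni + nj)
        d = {p: v for p, v in d.items() if i not in p and j not in p}
        del clusters[i], clusters[j]
        clusters[next_id] = ((ti, tj), ni + nj)
        next_id += 1
    (tree, _), = clusters.values()
    return tree
```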
Tag: molecular clock
Blog
UPGMA Algorithm Described - Unweighted Pair-Group Method with Arithmetic Mean
UPGMA is an agglomerative clustering algorithm, introduced by Sokal and Michener in 1958, that is ultrametric (it assumes a molecular clock: all lineages evolve at a constant rate).
The idea is to iterate until only one cluster remains: at each iteration, the two nearest clusters are joined (becoming a higher-level cluster). The distance between any two clusters is calculated by averaging the distances between the elements of each cluster.
To understand better, see UPGMA worked example by Dr Richard Edwards.
Tag: ultrametric
Blog
UPGMA Algorithm Described - Unweighted Pair-Group Method with Arithmetic Mean
UPGMA is an agglomerative clustering algorithm, introduced by Sokal and Michener in 1958, that is ultrametric (it assumes a molecular clock: all lineages evolve at a constant rate).
The idea is to iterate until only one cluster remains: at each iteration, the two nearest clusters are joined (becoming a higher-level cluster). The distance between any two clusters is calculated by averaging the distances between the elements of each cluster.
To understand better, see UPGMA worked example by Dr Richard Edwards.
Tag: unweighted pair-group Method with arithmetic mean
Blog
UPGMA Algorithm Described - Unweighted Pair-Group Method with Arithmetic Mean
UPGMA is an agglomerative clustering algorithm, introduced by Sokal and Michener in 1958, that is ultrametric (it assumes a molecular clock: all lineages evolve at a constant rate).
The idea is to iterate until only one cluster remains: at each iteration, the two nearest clusters are joined (becoming a higher-level cluster). The distance between any two clusters is calculated by averaging the distances between the elements of each cluster.
To understand better, see UPGMA worked example by Dr Richard Edwards.
Tag: upgma
Blog
UPGMA Algorithm Described - Unweighted Pair-Group Method with Arithmetic Mean
UPGMA is an agglomerative clustering algorithm, introduced by Sokal and Michener in 1958, that is ultrametric (it assumes a molecular clock: all lineages evolve at a constant rate).
The idea is to iterate until only one cluster remains: at each iteration, the two nearest clusters are joined (becoming a higher-level cluster). The distance between any two clusters is calculated by averaging the distances between the elements of each cluster.
To understand better, see UPGMA worked example by Dr Richard Edwards.
Tag: upgma algorithm
Blog
UPGMA Algorithm Described - Unweighted Pair-Group Method with Arithmetic Mean
UPGMA is an agglomerative clustering algorithm, introduced by Sokal and Michener in 1958, that is ultrametric (it assumes a molecular clock: all lineages evolve at a constant rate).
The idea is to iterate until only one cluster remains: at each iteration, the two nearest clusters are joined (becoming a higher-level cluster). The distance between any two clusters is calculated by averaging the distances between the elements of each cluster.
To understand better, see UPGMA worked example by Dr Richard Edwards.
Tag: biopdb
Blog
Structural Superimposition of Local Sequence Alignment using BioPython
This task was given to me as homework in one of my courses at the university, and I wanted to share my solution since I saw there was no such entry on the Internet.
The objectives here are:
Download (two) PDB files automatically from the server
Do the pairwise alignment after getting their amino acid sequences
Superimpose them and report the RMSD
The Bio.PDB module from BioPython works very well in this case.
Tag: biopython
Blog
Structural Superimposition of Local Sequence Alignment using BioPython
This task was given to me as homework in one of my courses at the university, and I wanted to share my solution since I saw there was no such entry on the Internet.
The objectives here are:
Download (two) PDB files automatically from the server
Do the pairwise alignment after getting their amino acid sequences
Superimpose them and report the RMSD
The Bio.PDB module from BioPython works very well in this case.
Tag: local alignment
Blog
Structural Superimposition of Local Sequence Alignment using BioPython
This task was given to me as homework in one of my courses at the university, and I wanted to share my solution since I saw there was no such entry on the Internet.
The objectives here are:
Download (two) PDB files automatically from the server
Do the pairwise alignment after getting their amino acid sequences
Superimpose them and report the RMSD
The Bio.PDB module from BioPython works very well in this case.
Tag: pairwise sequence alignment
Blog
Structural Superimposition of Local Sequence Alignment using BioPython
This task was given to me as homework in one of my courses at the university, and I wanted to share my solution since I saw there was no such entry on the Internet.
The objectives here are:
Download (two) PDB files automatically from the server
Do the pairwise alignment after getting their amino acid sequences
Superimpose them and report the RMSD
The Bio.PDB module from BioPython works very well in this case.
Tag: pdb
Blog
Structural Superimposition of Local Sequence Alignment using BioPython
This task was given to me as homework in one of my courses at the university, and I wanted to share my solution since I saw there was no such entry on the Internet.
The objectives here are:
Download (two) PDB files automatically from the server
Do the pairwise alignment after getting their amino acid sequences
Superimpose them and report the RMSD
The Bio.PDB module from BioPython works very well in this case.
Tag: rmsd
Blog
Structural Superimposition of Local Sequence Alignment using BioPython
This task was given to me as homework in one of my courses at the university, and I wanted to share my solution since I saw there was no such entry on the Internet.
The objectives here are:
Download (two) PDB files automatically from the server
Do the pairwise alignment after getting their amino acid sequences
Superimpose them and report the RMSD
The Bio.PDB module from BioPython works very well in this case.
Tag: structural superimposition
Blog
Structural Superimposition of Local Sequence Alignment using BioPython
This task was given to me as homework in one of my courses at the university, and I wanted to share my solution since I saw there was no such entry on the Internet.
The objectives here are:
Download (two) PDB files automatically from the server
Do the pairwise alignment after getting their amino acid sequences
Superimpose them and report the RMSD
The Bio.PDB module from BioPython works very well in this case.
Tag: how to install openpyxl
Blog
How to Install openpyxl on Windows
openpyxl is a Python library for reading and writing Excel 2007 xlsx/xlsm files. To download and install it on Windows:
Download it from Python Packages.
Then, to install it, extract the tarball you downloaded, open up CMD, navigate to the folder you extracted, and run the following:
C:\Users\Gungor>cd Downloads\openpyxl-2.1.2.tar\dist\openpyxl-2.1.2\openpyxl-2.1.2
C:\Users\Gungor\Downloads\openpyxl-2.1.2.tar\dist\openpyxl-2.1.2\openpyxl-2.1.2>python setup.py install
It’s going to install everything and report any errors. If nothing looks like an error, you’re good to go.
Blog
How to Install Numpy Python Package on Windows
Numpy (Numerical Python) is a great Python package that you should definitely make use of if you're doing scientific computing.
Installing it on Windows can be difficult if you don't know how to do it via the command line. There are unofficial Windows binaries for Numpy for both 32-bit and 64-bit Windows, which make it super easy to install.
Go to the link below and download the one for your system and Python version: http://www.
Blog
Set Up Google Cloud SDK on Windows using Cygwin
I believe Windows isn't the best environment for software development, but if you have to use it, there is nice software to make things easier. Cygwin will help us use the Google Cloud tools, but the installation requires certain things you should be aware of beforehand.
You'll need:
Python latest 2.7.x
Google Cloud SDK
Cygwin 32-bit (i.e. setup-x86.exe - note only this one works)
openssh, curl and latest 2.
Blog
How to Convert PLINK Binary Formats into Non-binary Formats
PLINK is a whole-genome association analysis toolset. To save time and space, you convert your data files to its binary formats (BED, FAM, BIM), but of course when you need to view the files, you have to convert them back to the non-binary formats (PED, MAP) to be able to open them in a text editor such as Notepad on Windows.
This operation is really easy. It requires PLINK, of course, and the following line of code entered in a DOS window (Run -> type cmd; hit ENTER) in the PLINK directory:
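The excerpt is truncated before the command itself, but in PLINK 1.x the binary-to-text conversion is typically done with `--bfile` plus `--recode`. A small sketch that builds that command line (the file prefix `mydata` is a hypothetical example, not from the post):

```python
def plink_recode_args(prefix, out_prefix=None):
    """Build the PLINK argument list that converts a binary
    fileset (prefix.bed/.bim/.fam) back to text (out.ped/.map)."""
    out_prefix = out_prefix or prefix
    return ["plink", "--bfile", prefix, "--recode", "--out", out_prefix]

args = plink_recode_args("mydata")
print(" ".join(args))  # → plink --bfile mydata --recode --out mydata
```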
Tag: matrix
Blog
Data Preprocessing II for Salmon Project
So, in our Multi-dimensional Modeling and Reconstruction of Signaling Networks in Salmonella-infected Human Cells project, we have several methods for constructing the networks, and the data still needs to be preprocessed so that it is ready to be analyzed with these methods.
One method needed a matrix with the protein names and the time series (2 min, 5 min, 10 min, 20 min) in the first row, with the value of each protein at each time point set to 1 or 0 according to variance, significance, and the size of the fold change.
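As a sketch of that binarization step (the threshold and protein names below are invented for illustration, not the project's actual criteria):

```python
def binarize(fold_changes, threshold=2.0):
    """Turn per-time-point fold changes into a 1/0 matrix:
    1 if the absolute fold change passes the threshold, else 0."""
    return {protein: [1 if abs(fc) >= threshold else 0 for fc in series]
            for protein, series in fold_changes.items()}

# Hypothetical fold changes at 2, 5, 10 and 20 minutes:
data = {"AKT1": [0.5, 2.3, 3.1, 1.2], "MAPK1": [2.5, 0.1, 0.4, 2.0]}
print(binarize(data))  # {'AKT1': [0, 1, 1, 0], 'MAPK1': [1, 0, 0, 1]}
```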
Blog
Data Preprocessing I for Salmon Project
Since we'll be using R for most of the analyses, we converted the XLS data file to CSV using MS Office Excel 2013, and then we had to fix several lines using Sublime Text 2 because three columns in those lines were left unquoted, which later caused a problem when reading the file in RStudio.
The data contains phosphorylation data for 8553 peptides. There are many missing data points for many peptides, and since IPI IDs were used for the peptides and these are no longer supported, we had to convert the IPI IDs to HGNC-approved symbols; the data did include these symbols as names, but they looked outdated.
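The IPI-to-HGNC conversion amounts to a lookup-table pass over the peptide IDs; a minimal sketch (the mapping entries below are placeholders, not real IPI records):

```python
def remap_ids(ids, ipi_to_hgnc):
    """Map each IPI ID to its HGNC symbol, keeping None for IDs
    with no known mapping so they can be reviewed by hand."""
    return [ipi_to_hgnc.get(ipi) for ipi in ids]

# Placeholder mapping table:
table = {"IPI00000001": "GENE1", "IPI00000002": "GENE2"}
print(remap_ids(["IPI00000001", "IPI00000003"], table))  # ['GENE1', None]
```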
Blog
Multi-dimensional Modeling and Reconstruction of Signaling Networks in Salmonella-infected Human Cells
In this study, we're going to use phosphorylation data from a research paper on the phosphoproteomic analysis of related cells.
The idea is to use and compare existing methods, and to develop them further, to better understand the nature of signaling events in these cells and to find key proteins that might be targets for disease diagnosis, prevention, and treatment.
This study will be submitted as a research paper, so I'm not going to publish any results here for now, but I'll mention the struggles I run into and the solutions I try.
Tag: ped
Blog
How to Convert PED to FASTA
You may need to convert PED files to FASTA format for further analyses in your studies. Use the script below for this purpose.
PED to FASTA converter on GitHub
It takes the first 6 columns of each line as the header line and the rest as the sequence, replacing 0s with Ns, and organizes the result into a FASTA file.
Note: 0s stand for missing nucleotides, as defined by default in PLINK.
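The core of that conversion rule can be sketched in a few lines of Python (this illustrates the rule described above; the header format here is an assumption, not necessarily what the GitHub script produces):

```python
def ped_line_to_fasta(line):
    """Convert one PED line: the first 6 columns become the FASTA
    header, the remaining alleles become the sequence, with the
    PLINK missing-genotype code 0 replaced by N."""
    fields = line.split()
    header = "_".join(fields[:6])
    sequence = "".join("N" if allele == "0" else allele
                       for allele in fields[6:])
    return ">{}\n{}".format(header, sequence)

ped = "FAM1 IND1 0 0 1 2 A C 0 G"
print(ped_line_to_fasta(ped))  # → >FAM1_IND1_0_0_1_2 then ACNG
```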
How to run:
Tag: download human genome
Blog
Download Human Reference Genome (HG19 - GRCh37)
Many variation-calling tools and many other methods in bioinformatics require a reference genome as input, so you may need to download the human reference genome or its sequences. Several sources provide the entire human genome freely and publicly; here I'll describe how to download the complete human genome from the University of California, Santa Cruz (UCSC) webpage.
An index of the gzip-compressed FASTA files of the human chromosomes can be found here on the UCSC webpage.
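The per-chromosome files follow a predictable naming scheme, so the download URLs can be generated rather than typed out; a sketch (the path below matches UCSC's hg19 chromosomes directory, but verify it against the index page before bulk-downloading):

```python
BASE = "http://hgdownload.soe.ucsc.edu/goldenPath/hg19/chromosomes"

def chromosome_urls():
    """URLs for the gzip-compressed FASTA of chr1-22, X, Y and M."""
    names = [str(n) for n in range(1, 23)] + ["X", "Y", "M"]
    return ["{}/chr{}.fa.gz".format(BASE, name) for name in names]

urls = chromosome_urls()
print(len(urls))   # → 25
print(urls[0])     # → http://hgdownload.soe.ucsc.edu/goldenPath/hg19/chromosomes/chr1.fa.gz
```

Each URL can then be fetched with wget or Python's urllib, and uncompressed with gunzip.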
Tag: bwa
Blog
ClipCrop Installation on Linux Mint 16 nvm, Node, npm Included
ClipCrop is a tool for detecting structural variations from SAM files, and it's built with Node.js.
ClipCrop uses two pieces of software internally, so they should be installed first.
Install SHRiMP2
SHRiMP is a software package for aligning genomic reads against a target genome.
$ mkdir ~/software
$ cd ~/software
$ wget http://compbio.cs.toronto.edu/shrimp/releases/SHRiMP_2_2_3.lx26.x86_64.tar.gz
$ tar xzvf SHRiMP_2_2_3.lx26.x86_64.tar.gz
$ cd SHRiMP_2_2_3
$ file bin/gmapper
$ export SHRIMP_FOLDER=$PWD
Install BWA
BWA is a software package for mapping low-divergent sequences against a large reference genome.
Blog
Mapping and Unmapped-Read Extraction Results
Previously I was working with only a portion of the data, but from now on I'll be working with all of it. So I extracted the compressed data I was given directly into my working directory and ran the operations on that.
My initial (FASTQ) file is 2153988289 bytes (2 GB) in size. After mapping with bwa, there were 6004193 sequences, or reads, in total. Then, after I removed the unmapped reads, the total read count dropped by 551065 to 5493128. That is, 9% of the data
Blog
Mapping (Alignment) with BWA
I forgot to write about this earlier. I had actually mentioned it, but I hadn't written anything about how it's done, and I hadn't added example commands either.
BWA takes our DNA sequence (in FASTQ format) and the reference genome (in my project, the human genome) and produces a .sai file. This file carries information about how the sequence aligns to the reference genome, and using this information I can separate out the unmapped reads.
First, we create our .sai file with the following command.
bwa aln $NGSDATAROOT/bwa/human_genome37 ChIP_NoIndex_L001_R1_complete_filtered.fastq > complete_alignment.sai
The .sai file we created is not very useful on its own, so we convert it into a SAM file and continue from there.
Blog
SAM Files - BAM Files - samtools
Actually, the pipeline I need to program will run its analyses directly on the unmapped reads. But since I couldn't find such a dataset, and the only data I have contains both mapped and unmapped reads, I first had to get rid of the mapped ones.
As I mentioned before, I do this with the bwa aligner (mapper). After a series of steps, bwa produces a SAM file, but I need a FASTQ file. For this, I'll convert the SAM file into the similar BAM format with samtools, and then obtain my FASTQ file with the bam2fastq tool.
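In SAM records, whether a read is unmapped is encoded in bit 0x4 of the FLAG field (column 2), which is what makes the mapped/unmapped split possible in the first place; a minimal sketch of that check (the two records below are fabricated examples):

```python
def is_unmapped(sam_line):
    """True if the SAM record's FLAG has the 0x4 (unmapped) bit set."""
    flag = int(sam_line.split("\t")[1])
    return bool(flag & 0x4)

mapped = "read1\t0\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII"
unmapped = "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTT\tIIIII"
print(is_unmapped(mapped), is_unmapped(unmapped))  # → False True
```

In practice samtools does this filtering for you (its `-f`/`-F` flag options select on exactly these bits).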
Blog
The FASTQ Format - FASTQ Files
Today I got the "test" sequence data I'll use while building the program. It consists of two FASTQ files, each compressed and yet still around 6 GB in size. Since I don't want to lose too much time, I'll of course use only a portion of one of these files.
My goal is to find the mappable reads in these FASTQ files with the BWA tool and then remove them, and to save the remaining unmapped reads in a format the MegaBLAST tool can understand (FASTA format).
By the way, since I'm doing the whole project on a Unix machine, I'm learning many commands; I'll try to write about them separately later.
Blog
BWA (Burrows-Wheeler Aligner) - Aligner/Mapper
As I mentioned in my previous post, I'm going to use an aligner (mapper) to find out to what extent my data maps to the reference genome. Then I'll run some analyses on the unmapped portion.
BWA (Burrows-Wheeler Aligner) is a program that aligns relatively short sequences to long reference genomes such as the human genome. The bwa-short algorithm is used for reads up to 200bp (bp: base pairs), and the BWA-SW algorithm for reads between 200bp and 100kbp.
Many factors play a role in choosing an aligner (mapper). There are many tools of this kind, and they have different features.
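That length-based split can be stated as a tiny rule (the cut-offs come from the paragraph above; treat this as a mnemonic, not BWA's internal logic):

```python
def bwa_algorithm(read_length_bp):
    """Suggest the BWA algorithm for a given read length, per the
    rule of thumb above: bwa-short up to 200bp, BWA-SW up to 100kbp."""
    if read_length_bp <= 200:
        return "bwa-short"
    if read_length_bp <= 100000:
        return "BWA-SW"
    raise ValueError("longer than the ranges discussed here")

print(bwa_algorithm(100), bwa_algorithm(5000))  # → bwa-short BWA-SW
```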
Tag: clipcrop
Blog
ClipCrop Installation on Linux Mint 16 nvm, Node, npm Included
ClipCrop is a tool for detecting structural variations from SAM files. And it’s built with Node.js.
ClipCrop uses two softwares internally so they should be installed first.
Install SHRiMP2
SHRiMP is a software package for aligning genomic reads against a target genome.
1$ mkdir ~/software 2$ cd ~/software 3$ wget http://compbio.cs.toronto.edu/shrimp/releases/SHRiMP_2_2_3.lx26.x86_64.tar.gz 4$ tar xzvf SHRiMP_2_2_3.lx26.x86_64.tar.gz 5$ cd SHRiMP_2_2_3 6$ file bin/gmapper 7$ export SHRIMP_FOLDER=$PWD Install BWA
BWA is a software package for mapping low-divergent sequences against a large reference genome.
Tag: install clipcrop
Blog
ClipCrop Installation on Linux Mint 16 nvm, Node, npm Included
ClipCrop is a tool for detecting structural variations from SAM files. And it’s built with Node.js.
ClipCrop uses two softwares internally so they should be installed first.
Install SHRiMP2
SHRiMP is a software package for aligning genomic reads against a target genome.
1$ mkdir ~/software 2$ cd ~/software 3$ wget http://compbio.cs.toronto.edu/shrimp/releases/SHRiMP_2_2_3.lx26.x86_64.tar.gz 4$ tar xzvf SHRiMP_2_2_3.lx26.x86_64.tar.gz 5$ cd SHRiMP_2_2_3 6$ file bin/gmapper 7$ export SHRIMP_FOLDER=$PWD Install BWA
BWA is a software package for mapping low-divergent sequences against a large reference genome.
Tag: install nodejs
Blog
ClipCrop Installation on Linux Mint 16 nvm, Node, npm Included
ClipCrop is a tool for detecting structural variations from SAM files. And it’s built with Node.js.
ClipCrop uses two softwares internally so they should be installed first.
Install SHRiMP2
SHRiMP is a software package for aligning genomic reads against a target genome.
1$ mkdir ~/software 2$ cd ~/software 3$ wget http://compbio.cs.toronto.edu/shrimp/releases/SHRiMP_2_2_3.lx26.x86_64.tar.gz 4$ tar xzvf SHRiMP_2_2_3.lx26.x86_64.tar.gz 5$ cd SHRiMP_2_2_3 6$ file bin/gmapper 7$ export SHRIMP_FOLDER=$PWD Install BWA
BWA is a software package for mapping low-divergent sequences against a large reference genome.
Tag: node
Blog
ClipCrop Installation on Linux Mint 16 nvm, Node, npm Included
ClipCrop is a tool for detecting structural variations from SAM files. And it’s built with Node.js.
ClipCrop uses two softwares internally so they should be installed first.
Install SHRiMP2
SHRiMP is a software package for aligning genomic reads against a target genome.
1$ mkdir ~/software 2$ cd ~/software 3$ wget http://compbio.cs.toronto.edu/shrimp/releases/SHRiMP_2_2_3.lx26.x86_64.tar.gz 4$ tar xzvf SHRiMP_2_2_3.lx26.x86_64.tar.gz 5$ cd SHRiMP_2_2_3 6$ file bin/gmapper 7$ export SHRIMP_FOLDER=$PWD Install BWA
BWA is a software package for mapping low-divergent sequences against a large reference genome.
Tag: nodejs
Blog
ClipCrop Installation on Linux Mint 16 nvm, Node, npm Included
ClipCrop is a tool for detecting structural variations from SAM files. And it’s built with Node.js.
ClipCrop uses two softwares internally so they should be installed first.
Install SHRiMP2
SHRiMP is a software package for aligning genomic reads against a target genome.
1$ mkdir ~/software 2$ cd ~/software 3$ wget http://compbio.cs.toronto.edu/shrimp/releases/SHRiMP_2_2_3.lx26.x86_64.tar.gz 4$ tar xzvf SHRiMP_2_2_3.lx26.x86_64.tar.gz 5$ cd SHRiMP_2_2_3 6$ file bin/gmapper 7$ export SHRIMP_FOLDER=$PWD Install BWA
BWA is a software package for mapping low-divergent sequences against a large reference genome.
Tag: npm
Blog
ClipCrop Installation on Linux Mint 16 nvm, Node, npm Included
ClipCrop is a tool for detecting structural variations from SAM files. And it’s built with Node.js.
ClipCrop uses two softwares internally so they should be installed first.
Install SHRiMP2
SHRiMP is a software package for aligning genomic reads against a target genome.
1$ mkdir ~/software 2$ cd ~/software 3$ wget http://compbio.cs.toronto.edu/shrimp/releases/SHRiMP_2_2_3.lx26.x86_64.tar.gz 4$ tar xzvf SHRiMP_2_2_3.lx26.x86_64.tar.gz 5$ cd SHRiMP_2_2_3 6$ file bin/gmapper 7$ export SHRIMP_FOLDER=$PWD Install BWA
BWA is a software package for mapping low-divergent sequences against a large reference genome.
Tag: nvm
Blog
ClipCrop Installation on Linux Mint 16 nvm, Node, npm Included
ClipCrop is a tool for detecting structural variations from SAM files. And it’s built with Node.js.
ClipCrop uses two softwares internally so they should be installed first.
Install SHRiMP2
SHRiMP is a software package for aligning genomic reads against a target genome.
1$ mkdir ~/software 2$ cd ~/software 3$ wget http://compbio.cs.toronto.edu/shrimp/releases/SHRiMP_2_2_3.lx26.x86_64.tar.gz 4$ tar xzvf SHRiMP_2_2_3.lx26.x86_64.tar.gz 5$ cd SHRiMP_2_2_3 6$ file bin/gmapper 7$ export SHRIMP_FOLDER=$PWD Install BWA
BWA is a software package for mapping low-divergent sequences against a large reference genome.
Tag: sam
Blog
ClipCrop Installation on Linux Mint 16 nvm, Node, npm Included
ClipCrop is a tool for detecting structural variations from SAM files. And it’s built with Node.js.
ClipCrop uses two softwares internally so they should be installed first.
Install SHRiMP2
SHRiMP is a software package for aligning genomic reads against a target genome.
1$ mkdir ~/software 2$ cd ~/software 3$ wget http://compbio.cs.toronto.edu/shrimp/releases/SHRiMP_2_2_3.lx26.x86_64.tar.gz 4$ tar xzvf SHRiMP_2_2_3.lx26.x86_64.tar.gz 5$ cd SHRiMP_2_2_3 6$ file bin/gmapper 7$ export SHRIMP_FOLDER=$PWD Install BWA
BWA is a software package for mapping low-divergent sequences against a large reference genome.
Blog
SAM File - BAM File - samtools
Actually, the pipeline I need to program will run its analyses directly on unmapped reads. However, since I could not find such a dataset, and the only data I have contains both mapped and unmapped reads, I first had to get rid of the mapped ones.
As I mentioned before, I am doing this with the bwa aligner (mapper). After a series of steps, bwa produces a SAM file, but I need a FASTQ file. For this, I will convert the SAM file into the similar BAM format with samtools, and then obtain my FASTQ file with the bam2fastq tool.
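The two conversions described above can be sketched as a pair of commands. This is a sketch under assumptions: samtools 0.1.x-era syntax, and the usual bam2fastq invocation where '#' in the output name expands per read pair; check each tool's documentation before relying on it.

```shell
# SAM -> BAM (-b: output BAM, -S: input is SAM; old samtools syntax)
samtools view -bS aln.sam > aln.bam

# BAM -> FASTQ; for paired-end data, '#' becomes 1 and 2 in the file names
bam2fastq -o reads#.fastq aln.bam
```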
Tag: shrimp2
Blog
ClipCrop Installation on Linux Mint 16 nvm, Node, npm Included
ClipCrop is a tool for detecting structural variations from SAM files, and it's built with Node.js.
ClipCrop uses two other tools internally, so they should be installed first.
Install SHRiMP2
SHRiMP is a software package for aligning genomic reads against a target genome.
$ mkdir ~/software
$ cd ~/software
$ wget http://compbio.cs.toronto.edu/shrimp/releases/SHRiMP_2_2_3.lx26.x86_64.tar.gz
$ tar xzvf SHRiMP_2_2_3.lx26.x86_64.tar.gz
$ cd SHRiMP_2_2_3
$ file bin/gmapper
$ export SHRIMP_FOLDER=$PWD
Install BWA
BWA is a software package for mapping low-divergent sequences against a large reference genome.
Tag: cython
Blog
JointSNVMix Installation on Linux Mint 16 Cython, Pysam Included
JointSNVMix is a software package that consists of a number of tools for calling somatic mutations in tumour/normal paired NGS data.
It requires Python (>= 2.7), Cython (>= 0.13) and Pysam (== 0.5.0).
Python should already be installed by default on a Linux machine, so I will describe the installation of the others and of JointSNVMix.
Note that this guide may become outdated over time, so please verify the steps before following them.
Install Cython
Tag: distribute
Blog
JointSNVMix Installation on Linux Mint 16 Cython, Pysam Included
JointSNVMix is a software package that consists of a number of tools for calling somatic mutations in tumour/normal paired NGS data.
It requires Python (>= 2.7), Cython (>= 0.13) and Pysam (== 0.5.0).
Python should already be installed by default on a Linux machine, so I will describe the installation of the others and of JointSNVMix.
Note that this guide may become outdated over time, so please verify the steps before following them.
Install Cython
Tag: ez_setup
Blog
JointSNVMix Installation on Linux Mint 16 Cython, Pysam Included
JointSNVMix is a software package that consists of a number of tools for calling somatic mutations in tumour/normal paired NGS data.
It requires Python (>= 2.7), Cython (>= 0.13) and Pysam (== 0.5.0).
Python should already be installed by default on a Linux machine, so I will describe the installation of the others and of JointSNVMix.
Note that this guide may become outdated over time, so please verify the steps before following them.
Install Cython
Tag: jointsnvmix
Blog
JointSNVMix Installation on Linux Mint 16 Cython, Pysam Included
JointSNVMix is a software package that consists of a number of tools for calling somatic mutations in tumour/normal paired NGS data.
It requires Python (>= 2.7), Cython (>= 0.13) and Pysam (== 0.5.0).
Python should already be installed by default on a Linux machine, so I will describe the installation of the others and of JointSNVMix.
Note that this guide may become outdated over time, so please verify the steps before following them.
Install Cython
Tag: pysam
Blog
JointSNVMix Installation on Linux Mint 16 Cython, Pysam Included
JointSNVMix is a software package that consists of a number of tools for calling somatic mutations in tumour/normal paired NGS data.
It requires Python (>= 2.7), Cython (>= 0.13) and Pysam (== 0.5.0).
Python should already be installed by default on a Linux machine, so I will describe the installation of the others and of JointSNVMix.
Note that this guide may become outdated over time, so please verify the steps before following them.
Install Cython
Tag: curl
Blog
Set Up Google Cloud SDK on Windows using Cygwin
I believe Windows isn't the best environment for software development, but if you have to use it, there are nice tools to make things easier. Cygwin will help us use the Google Cloud tools here, but the installation requires certain things you should be aware of beforehand.
You’ll need
Python latest 2.7.x
Google Cloud SDK
Cygwin 32-bit (i.e. setup-x86.exe - note only this one works)
openssh, curl and latest 2.
Tag: cygwin
Blog
Set Up Google Cloud SDK on Windows using Cygwin
I believe Windows isn't the best environment for software development, but if you have to use it, there are nice tools to make things easier. Cygwin will help us use the Google Cloud tools here, but the installation requires certain things you should be aware of beforehand.
You’ll need
Python latest 2.7.x
Google Cloud SDK
Cygwin 32-bit (i.e. setup-x86.exe - note only this one works)
openssh, curl and latest 2.
Tag: cygwin 32-bit
Blog
Set Up Google Cloud SDK on Windows using Cygwin
I believe Windows isn't the best environment for software development, but if you have to use it, there are nice tools to make things easier. Cygwin will help us use the Google Cloud tools here, but the installation requires certain things you should be aware of beforehand.
You’ll need
Python latest 2.7.x
Google Cloud SDK
Cygwin 32-bit (i.e. setup-x86.exe - note only this one works)
openssh, curl and latest 2.
Tag: google cloud
Blog
Set Up Google Cloud SDK on Windows using Cygwin
I believe Windows isn't the best environment for software development, but if you have to use it, there are nice tools to make things easier. Cygwin will help us use the Google Cloud tools here, but the installation requires certain things you should be aware of beforehand.
You’ll need
Python latest 2.7.x
Google Cloud SDK
Cygwin 32-bit (i.e. setup-x86.exe - note only this one works)
openssh, curl and latest 2.
Tag: google cloud sdk
Blog
Set Up Google Cloud SDK on Windows using Cygwin
I believe Windows isn't the best environment for software development, but if you have to use it, there are nice tools to make things easier. Cygwin will help us use the Google Cloud tools here, but the installation requires certain things you should be aware of beforehand.
You’ll need
Python latest 2.7.x
Google Cloud SDK
Cygwin 32-bit (i.e. setup-x86.exe - note only this one works)
openssh, curl and latest 2.
Tag: openssh
Blog
Set Up Google Cloud SDK on Windows using Cygwin
I believe Windows isn't the best environment for software development, but if you have to use it, there are nice tools to make things easier. Cygwin will help us use the Google Cloud tools here, but the installation requires certain things you should be aware of beforehand.
You’ll need
Python latest 2.7.x
Google Cloud SDK
Cygwin 32-bit (i.e. setup-x86.exe - note only this one works)
openssh, curl and latest 2.
Tag: biotype
Blog
Super Long Introns of Euarchontoglires
There was another weird result from my exon/intron boundary analysis research. Moving toward less diverse species, gene intron lengths are shown to increase. However, according to my findings, at the point of Euarchontoglires (also called Supraprimates) this increase is very sharp and seems unexpected. So, I looked at the exon/intron lengths of each gene in each taxonomic rank to try to see what gives Euarchontoglires genes such long introns.
As you see in the graph above, Euarchontoglires introns are very long compared to the rest.
Tag: ensembl
Blog
Super Long Introns of Euarchontoglires
There was another weird result from my exon/intron boundary analysis research. Moving toward less diverse species, gene intron lengths are shown to increase. However, according to my findings, at the point of Euarchontoglires (also called Supraprimates) this increase is very sharp and seems unexpected. So, I looked at the exon/intron lengths of each gene in each taxonomic rank to try to see what gives Euarchontoglires genes such long introns.
As you see in the graph above, Euarchontoglires introns are very long compared to the rest.
Blog
An Exon of Length 2 Appeared in Ensembl
I want to share an interesting finding about our research on exon/intron analysis of human evolutionary history.
So I had the genes that emerged at each pass point of human history and I was using Ensembl API to get exons and introns of these genes to perform further analyses.
There was one gene (ENSG00000197568 - HERV-H LTR-associating 3 - HHLA3) with a surprise: one of its transcripts (ENST00000432224) had an exon (ENSE00001707577) of length 2.
Blog
How to Get Transcripts (also Exons & Introns) of a Gene using Ensembl API
As a part of my project, I need to obtain the exons and introns of certain genes. These are actually human genes chosen for a specific reason that I will describe later when I explain my project. But for now, I want to share the way to obtain this information using the (Perl) Ensembl API. Note that Ensembl has introduced a beautiful new way (the Ensembl REST API) of getting data, but it is in beta and doesn't provide exon/intron information.
Blog
Install Ensembl API and BioPerl 1.2.3 on Your System
I’m going to work on a project that requires lots of queries on Ensembl databases so I wanted to install Ensembl API to begin with. Since it’s programmed in Perl, I will be using Perl in this project.
There is a nice tutorial on Ensembl website for API installation. Here I will describe some steps.
1. Download the API and BioPerl
Go to Ensembl FTP ftp://ftp.ensembl.org/pub/ and download “ensembl-api.tar.gz”
Tag: ensembl api
Blog
Super Long Introns of Euarchontoglires
There was another weird result from my exon/intron boundary analysis research. Moving toward less diverse species, gene intron lengths are shown to increase. However, according to my findings, at the point of Euarchontoglires (also called Supraprimates) this increase is very sharp and seems unexpected. So, I looked at the exon/intron lengths of each gene in each taxonomic rank to try to see what gives Euarchontoglires genes such long introns.
As you see in the graph above, Euarchontoglires introns are very long compared to the rest.
Blog
How to Get Transcripts (also Exons & Introns) of a Gene using Ensembl API
As a part of my project, I need to obtain the exons and introns of certain genes. These are actually human genes chosen for a specific reason that I will describe later when I explain my project. But for now, I want to share the way to obtain this information using the (Perl) Ensembl API. Note that Ensembl has introduced a beautiful new way (the Ensembl REST API) of getting data, but it is in beta and doesn't provide exon/intron information.
Blog
Install Ensembl API and BioPerl 1.2.3 on Your System
I’m going to work on a project that requires lots of queries on Ensembl databases so I wanted to install Ensembl API to begin with. Since it’s programmed in Perl, I will be using Perl in this project.
There is a nice tutorial on Ensembl website for API installation. Here I will describe some steps.
1. Download the API and BioPerl
Go to Ensembl FTP ftp://ftp.ensembl.org/pub/ and download “ensembl-api.tar.gz”
Tag: euarchontoglires
Blog
Super Long Introns of Euarchontoglires
There was another weird result from my exon/intron boundary analysis research. Moving toward less diverse species, gene intron lengths are shown to increase. However, according to my findings, at the point of Euarchontoglires (also called Supraprimates) this increase is very sharp and seems unexpected. So, I looked at the exon/intron lengths of each gene in each taxonomic rank to try to see what gives Euarchontoglires genes such long introns.
As you see in the graph above, Euarchontoglires introns are very long compared to the rest.
Tag: exon
Blog
Super Long Introns of Euarchontoglires
There was another weird result from my exon/intron boundary analysis research. Moving toward less diverse species, gene intron lengths are shown to increase. However, according to my findings, at the point of Euarchontoglires (also called Supraprimates) this increase is very sharp and seems unexpected. So, I looked at the exon/intron lengths of each gene in each taxonomic rank to try to see what gives Euarchontoglires genes such long introns.
As you see in the graph above, Euarchontoglires introns are very long compared to the rest.
Blog
How to Get Transcripts (also Exons & Introns) of a Gene using Ensembl API
As a part of my project, I need to obtain the exons and introns of certain genes. These are actually human genes chosen for a specific reason that I will describe later when I explain my project. But for now, I want to share the way to obtain this information using the (Perl) Ensembl API. Note that Ensembl has introduced a beautiful new way (the Ensembl REST API) of getting data, but it is in beta and doesn't provide exon/intron information.
Tag: exon and intron boundaries
Blog
Super Long Introns of Euarchontoglires
There was another weird result from my exon/intron boundary analysis research. Moving toward less diverse species, gene intron lengths are shown to increase. However, according to my findings, at the point of Euarchontoglires (also called Supraprimates) this increase is very sharp and seems unexpected. So, I looked at the exon/intron lengths of each gene in each taxonomic rank to try to see what gives Euarchontoglires genes such long introns.
As you see in the graph above, Euarchontoglires introns are very long compared to the rest.
Blog
How to Get Transcripts (also Exons & Introns) of a Gene using Ensembl API
As a part of my project, I need to obtain the exons and introns of certain genes. These are actually human genes chosen for a specific reason that I will describe later when I explain my project. But for now, I want to share the way to obtain this information using the (Perl) Ensembl API. Note that Ensembl has introduced a beautiful new way (the Ensembl REST API) of getting data, but it is in beta and doesn't provide exon/intron information.
Tag: homo sapiens
Blog
Super Long Introns of Euarchontoglires
There was another weird result from my exon/intron boundary analysis research. Moving toward less diverse species, gene intron lengths are shown to increase. However, according to my findings, at the point of Euarchontoglires (also called Supraprimates) this increase is very sharp and seems unexpected. So, I looked at the exon/intron lengths of each gene in each taxonomic rank to try to see what gives Euarchontoglires genes such long introns.
As you see in the graph above, Euarchontoglires introns are very long compared to the rest.
Blog
Second Dataset Analysis Results
I have completed the analysis of the second dataset, which has fewer unmapped reads. Since it is a better sequencing sample than the previous one, the results I got were also quite consistent. After analyzing a sequence belonging to the human genome, I obtained the results below.
LIST OF ORGANISMS AND THEIR NUMBER OF OCCURRENCES
Ambiguous hit 1323
Homo sapiens 312
Pan troglodytes 25
Pongo abelii 18
Nomascus leucogenys 17
Halomonas sp. GFAJ-1 7
Callithrix jacchus 4
Macaca mulatta 3
Oryctolagus cuniculus 2
Loxodonta africana 1
Cavia porcellus 1
I will explain the term “Ambiguous hit” in another post.
Tag: intron
Blog
Super Long Introns of Euarchontoglires
There was another weird result from my exon/intron boundary analysis research. Moving toward less diverse species, gene intron lengths are shown to increase. However, according to my findings, at the point of Euarchontoglires (also called Supraprimates) this increase is very sharp and seems unexpected. So, I looked at the exon/intron lengths of each gene in each taxonomic rank to try to see what gives Euarchontoglires genes such long introns.
As you see in the graph above, Euarchontoglires introns are very long compared to the rest.
Blog
How to Get Transcripts (also Exons & Introns) of a Gene using Ensembl API
As a part of my project, I need to obtain the exons and introns of certain genes. These are actually human genes chosen for a specific reason that I will describe later when I explain my project. But for now, I want to share the way to obtain this information using the (Perl) Ensembl API. Note that Ensembl has introduced a beautiful new way (the Ensembl REST API) of getting data, but it is in beta and doesn't provide exon/intron information.
Tag: intron length
Blog
Super Long Introns of Euarchontoglires
There was another weird result from my exon/intron boundary analysis research. Moving toward less diverse species, gene intron lengths are shown to increase. However, according to my findings, at the point of Euarchontoglires (also called Supraprimates) this increase is very sharp and seems unexpected. So, I looked at the exon/intron lengths of each gene in each taxonomic rank to try to see what gives Euarchontoglires genes such long introns.
As you see in the graph above, Euarchontoglires introns are very long compared to the rest.
Tag: protein coding
Blog
Super Long Introns of Euarchontoglires
There was another weird result from my exon/intron boundary analysis research. Moving toward less diverse species, gene intron lengths are shown to increase. However, according to my findings, at the point of Euarchontoglires (also called Supraprimates) this increase is very sharp and seems unexpected. So, I looked at the exon/intron lengths of each gene in each taxonomic rank to try to see what gives Euarchontoglires genes such long introns.
As you see in the graph above, Euarchontoglires introns are very long compared to the rest.
Tag: databases
Blog
An Exon of Length 2 Appeared in Ensembl
I want to share an interesting finding about our research on exon/intron analysis of human evolutionary history.
So I had the genes that emerged at each pass point of human history and I was using Ensembl API to get exons and introns of these genes to perform further analyses.
There was one gene (ENSG00000197568 - HERV-H LTR-associating 3 - HHLA3) with a surprise: one of its transcripts (ENST00000432224) had an exon (ENSE00001707577) of length 2.
Tag: eiban
Blog
An Exon of Length 2 Appeared in Ensembl
I want to share an interesting finding about our research on exon/intron analysis of human evolutionary history.
So I had the genes that emerged at each pass point of human history and I was using Ensembl API to get exons and introns of these genes to perform further analyses.
There was one gene (ENSG00000197568 - HERV-H LTR-associating 3 - HHLA3) with a surprise: one of its transcripts (ENST00000432224) had an exon (ENSE00001707577) of length 2.
Tag: exons
Blog
An Exon of Length 2 Appeared in Ensembl
I want to share an interesting finding about our research on exon/intron analysis of human evolutionary history.
So I had the genes that emerged at each pass point of human history and I was using Ensembl API to get exons and introns of these genes to perform further analyses.
There was one gene (ENSG00000197568 - HERV-H LTR-associating 3 - HHLA3) with a surprise: one of its transcripts (ENST00000432224) had an exon (ENSE00001707577) of length 2.
Tag: introns
Blog
An Exon of Length 2 Appeared in Ensembl
I want to share an interesting finding about our research on exon/intron analysis of human evolutionary history.
So I had the genes that emerged at each pass point of human history and I was using Ensembl API to get exons and introns of these genes to perform further analyses.
There was one gene (ENSG00000197568 - HERV-H LTR-associating 3 - HHLA3) with a surprise: one of its transcripts (ENST00000432224) had an exon (ENSE00001707577) of length 2.
Tag: bed
Blog
How to Convert PLINK Binary Formats into Non-binary Formats
PLINK is a whole genome association analysis toolset. To save time and space, you need to convert your data files to its binary formats (BED, FAM, BIM), but of course when you need to view the files, you have to convert them back to the non-binary formats (PED, MAP) to be able to open them in a text editor such as Notepad on Windows.
This operation is really easy. It requires PLINK of course, and the following line of code typed into a DOS window (Run -> type cmd; hit ENTER) in the PLINK directory:
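The excerpt ends before showing the command itself. With PLINK 1.x, the binary-to-text conversion is usually a single --recode call; the file name "mydata" below is a placeholder, not taken from the original post.

```shell
# Reads mydata.bed / mydata.bim / mydata.fam and
# writes mydata.ped / mydata.map next to them
plink --bfile mydata --recode --out mydata
```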
Tag: bim
Blog
How to Convert PLINK Binary Formats into Non-binary Formats
PLINK is a whole genome association analysis toolset. To save time and space, you need to convert your data files to its binary formats (BED, FAM, BIM), but of course when you need to view the files, you have to convert them back to the non-binary formats (PED, MAP) to be able to open them in a text editor such as Notepad on Windows.
This operation is really easy. It requires PLINK of course, and the following line of code typed into a DOS window (Run -> type cmd; hit ENTER) in the PLINK directory:
Tag: binary
Blog
How to Convert PLINK Binary Formats into Non-binary Formats
PLINK is a whole genome association analysis toolset. To save time and space, you need to convert your data files to its binary formats (BED, FAM, BIM), but of course when you need to view the files, you have to convert them back to the non-binary formats (PED, MAP) to be able to open them in a text editor such as Notepad on Windows.
This operation is really easy. It requires PLINK of course, and the following line of code typed into a DOS window (Run -> type cmd; hit ENTER) in the PLINK directory:
Tag: fam
Blog
How to Convert PLINK Binary Formats into Non-binary Formats
PLINK is a whole genome association analysis toolset. To save time and space, you need to convert your data files to its binary formats (BED, FAM, BIM), but of course when you need to view the files, you have to convert them back to the non-binary formats (PED, MAP) to be able to open them in a text editor such as Notepad on Windows.
This operation is really easy. It requires PLINK of course, and the following line of code typed into a DOS window (Run -> type cmd; hit ENTER) in the PLINK directory:
Tag: format conversion
Blog
How to Convert PLINK Binary Formats into Non-binary Formats
PLINK is a whole genome association analysis toolset. To save time and space, you need to convert your data files to its binary formats (BED, FAM, BIM), but of course when you need to view the files, you have to convert them back to the non-binary formats (PED, MAP) to be able to open them in a text editor such as Notepad on Windows.
This operation is really easy. It requires PLINK of course, and the following line of code typed into a DOS window (Run -> type cmd; hit ENTER) in the PLINK directory:
Tag: map
Blog
How to Convert PLINK Binary Formats into Non-binary Formats
PLINK is a whole genome association analysis toolset. To save time and space, you need to convert your data files to its binary formats (BED, FAM, BIM), but of course when you need to view the files, you have to convert them back to the non-binary formats (PED, MAP) to be able to open them in a text editor such as Notepad on Windows.
This operation is really easy. It requires PLINK of course, and the following line of code typed into a DOS window (Run -> type cmd; hit ENTER) in the PLINK directory:
Tag: ms-dos
Blog
How to Convert PLINK Binary Formats into Non-binary Formats
PLINK is a whole genome association analysis toolset. To save time and space, you need to convert your data files to its binary formats (BED, FAM, BIM), but of course when you need to view the files, you have to convert them back to the non-binary formats (PED, MAP) to be able to open them in a text editor such as Notepad on Windows.
This operation is really easy. It requires PLINK of course, and the following line of code typed into a DOS window (Run -> type cmd; hit ENTER) in the PLINK directory:
Tag: non-binary
Blog
How to Convert PLINK Binary Formats into Non-binary Formats
PLINK is a whole genome association analysis toolset. To save time and space, you need to convert your data files to its binary formats (BED, FAM, BIM), but of course when you need to view the files, you have to convert them back to the non-binary formats (PED, MAP) to be able to open them in a text editor such as Notepad on Windows.
This operation is really easy. It requires PLINK of course, and the following line of code typed into a DOS window (Run -> type cmd; hit ENTER) in the PLINK directory:
Tag: gene
Blog
How to Get Transcripts (also Exons & Introns) of a Gene using Ensembl API
As a part of my project, I need to obtain the exons and introns of certain genes. These are actually human genes chosen for a specific reason that I will describe later when I explain my project. But for now, I want to share the way to obtain this information using the (Perl) Ensembl API. Note that Ensembl has introduced a beautiful new way (the Ensembl REST API) of getting data, but it is in beta and doesn't provide exon/intron information.
Blog
Retrieving Data with AJAX using jQuery, PHP and MySQL
Last semester, I took a course from the Informatics Institute at METU called “Biological Databases and Data Analysis Tools”, where we first learned what a database is, how to run queries on it, and the technology behind databases. Then, we learned about the many biological databases and data analysis tools available, including gene, protein, and pathway databases, and tools for creating databases.
As a final project, we were asked to create an online tool that can search a database and get the data and display it on any web browsers.
Tag: github
Blog
How to Get Transcripts (also Exons & Introns) of a Gene using Ensembl API
As a part of my project, I need to obtain the exons and introns of certain genes. These are actually human genes chosen for a specific reason that I will describe later when I explain my project. But for now, I want to share the way to obtain this information using the (Perl) Ensembl API. Note that Ensembl has introduced a beautiful new way (the Ensembl REST API) of getting data, but it is in beta and doesn't provide exon/intron information.
Tag: transcript
Blog
How to Get Transcripts (also Exons & Introns) of a Gene using Ensembl API
As a part of my project, I need to obtain the exons and introns of certain genes. These are actually human genes chosen for a specific reason that I will describe later when I explain my project. But for now, I want to share the way to obtain this information using the (Perl) Ensembl API. Note that Ensembl has introduced a beautiful new way (the Ensembl REST API) of getting data, but it is in beta and doesn't provide exon/intron information.
Tag: geany
Blog
Geany Color Schemes Ubuntu
There is a collection of color schemes for Geany as well.
Download it on GitHub and follow the instructions.
You’ll need to extract and copy all the files in colorschemes directory to ~/.config/geany/colorschemes/
Then, restart Geany and go to View -> Editor -> Color Schemes and choose your style.
I’m using Tango.
Source
Blog
Install Geany 1.23 on Ubuntu
Geany is a really nice text editor for Ubuntu. I would recommend it together with the TreeBrowser plugin and one of the available color schemes.
But you’ll need the latest version which is 1.23 for now.
To install this version you need to add a PPA; this will also keep it updated when you update your system.
Execute following lines one by one:
sudo add-apt-repository ppa:geany-dev/ppa
sudo apt-get update
sudo apt-get install geany
Then, when you start Geany you’ll see “This is Geany 1.
Blog
A Nice File Browser for Geany 1.23 on Ubuntu 12.04 LTS
If you’re looking for a file browser for Geany, check out TreeBrowser plugin on its page (see the page for screenshots).
To install and enable it, just run the following in a Terminal:
sudo apt-get install geany-plugin-treebrowser
Then go to “Tools” -> “Plugin Manager” and check “TreeBrowser”
Source
Tag: text editor
Blog
Geany Color Schemes Ubuntu
There is a collection of color schemes for Geany as well.
Download it on GitHub and follow the instructions.
You’ll need to extract and copy all the files in colorschemes directory to ~/.config/geany/colorschemes/
Then, restart Geany and go to View -> Editor -> Color Schemes and choose your style.
I’m using Tango.
Source
Blog
Install Geany 1.23 on Ubuntu
Geany is a really nice text editor for Ubuntu. I would recommend it together with the TreeBrowser plugin and one of the available color schemes.
But you’ll need the latest version which is 1.23 for now.
To install this version you need to add a PPA; this will also keep it updated when you update your system.
Execute following lines one by one:
sudo add-apt-repository ppa:geany-dev/ppa
sudo apt-get update
sudo apt-get install geany
Then, when you start Geany you’ll see “This is Geany 1.
Blog
A Nice File Browser for Geany 1.23 on Ubuntu 12.04 LTS
If you’re looking for a file browser for Geany, check out TreeBrowser plugin on its page (see the page for screenshots).
To install and enable it, just run the following in a Terminal:
sudo apt-get install geany-plugin-treebrowser
Then go to “Tools” -> “Plugin Manager” and check “TreeBrowser”
Source
Tag: php
Blog
Install Apache2, PHP5, MySQL & phpMyAdmin on Ubuntu 12.04
First, install apache2:
sudo apt-get install apache2
Then, for it to work: sudo service apache2 restart
For custom www folder:
sudo cp /etc/apache2/sites-available/default /etc/apache2/sites-available/www
gksudo gedit /etc/apache2/sites-available/www
Change the DocumentRoot and Directory directives to point to the new location, for example /home/user/www/
Save and check the result (see also: clean URLs not working in Laravel 4)
Make www default and disable default:
sudo a2dissite default && sudo a2ensite www
sudo service apache2 restart
Create a new file in www
Blog
session_start() Permission denied (13) Laravel 4
Solve it by running following lines:
chmod -R 755 /path/to/your/laravel/directory
chmod -R o+w /path/to/your/laravel/directory
And/or maybe:
sudo chown -R www-data:user /path/to/your/laravel/directory
Blog
If clean URLs don't work in Laravel 4 on Ubuntu 12.04 LTS
The .htaccess directives are correct, mod_rewrite is enabled, but you are still getting 404 Not Found errors…
You need to change AllowOverride None to AllowOverride All in /etc/apache2/sites-available/default.
Modified section in the file:
<Directory /home/user/www/> Options Indexes FollowSymLinks MultiViews AllowOverride All Order allow,deny allow from all </Directory>
Blog
Permission Issues develop Laravel 4 on Ubuntu 12.04 LTS
If your CSS or JS files don’t seem to load, or you get 403 Forbidden or Permission denied errors, all you need to do is run the following in a terminal:
sudo chmod -R 755 /path/to/your/laravel/directory
Blog
Base URL for Your Laravel 4 Website
To get the base URL of your website for generating links to your content or assets, do the following:
Set $url in app/config/app.php to your base URL:
'url' => 'http://localhost/example',
Use it everywhere with URL::to(), for example:
echo URL::to('assets/css/general.css');
/* outputs http://localhost/example/assets/css/general.css */
Blog
Remove public from URL Laravel 4
Move all the contents (files) of the public/ folder one level up (to the base directory)
Fix paths in index.php:
require __DIR__.'/bootstrap/autoload.php';
$app = require_once __DIR__.'/bootstrap/start.php';
Fix the path in bootstrap/paths.php:
'public' => __DIR__.'/..',
Done
Source
Blog
Some String Functions in R, String Manipulation in R
I have programmed with Perl, Python, and PHP before, and string manipulation was more direct and easier in them than in R. But still there are useful functions for string manipulation in R. I’m not an expert in R but I’ve been dealing with it for a while and I’ve learned some good functions for this purpose.
Concatenate strings
Concatenation is done with the paste function. It takes the strings to concatenate as comma-separated arguments, as well as the separator character(s).
Blog
Retrieving Data with AJAX using jQuery, PHP and MySQL
Last semester, I took a course from the Informatics Institute at METU called “Biological Databases and Data Analysis Tools”, where we first learned what a database is, how to run queries on it, and the technology behind databases. Then, we learned about the many biological databases and data analysis tools available, including gene, protein, and pathway databases, and tools for creating databases.
As a final project, we were asked to create an online tool that can search a database, retrieve data, and display it in any web browser.
Tag: phpmyadmin
Blog
Install Apache2, PHP5, MySQL & phpMyAdmin on Ubuntu 12.04
First, install apache2:
sudo apt-get install apache2
Then, for it to work:
sudo service apache2 restart
For custom www folder:
sudo cp /etc/apache2/sites-available/default /etc/apache2/sites-available/www
gksudo gedit /etc/apache2/sites-available/www
Change the DocumentRoot and Directory directives to point to the new location. For example, /home/user/www/
Save and test (see also: clean URLs not working in Laravel 4)
Make www default and disable default:
sudo a2dissite default && sudo a2ensite www
sudo service apache2 restart
Create a new file in www
Tag: perl
Blog
Install Perl DBI Module on Ubuntu 12.04
On Terminal, run:
sudo apt-get install libdbi-perl
Source
Blog
Install Ensembl API and BioPerl 1.2.3 on Your System
I’m going to work on a project that requires lots of queries on Ensembl databases, so I wanted to start by installing the Ensembl API. Since it’s written in Perl, I will be using Perl in this project.
There is a nice tutorial on the Ensembl website for installing the API. Here I will describe some of the steps.
1. Download the API and BioPerl
Go to Ensembl FTP ftp://ftp.ensembl.org/pub/ and download “ensembl-api.tar.gz” or click here
Blog
Some String Functions in R, String Manipulation in R
Blog
Performing Multiple Searches on SRS
Since the latest version of the parsing script examines many more reads than its predecessors, searching a name on SRS for every single read was a very time-consuming operation. So much so that the last analysis took 4 days.
To reduce this, I completely rewrote the parsing script. As always, it first takes the hits that pass the threshold, but now I collect their ID numbers directly in an array. I then turn this list into a single string by joining its elements with the pipe character.
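The join step can be sketched in shell (the IDs below are made up; `paste -s` does the pipe-joining):

```shell
# Hypothetical read IDs standing in for hits that passed the threshold
printf 'AB123\nCD456\nEF789\n' > ids.txt

# Join all IDs with the pipe character so one SRS query can match them all
query=$(paste -sd'|' ids.txt)
echo "$query"
```

Used as an alternation pattern, `AB123|CD456|EF789` matches any of the three IDs, so a single query replaces three separate ones.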
Blog
Parsing MegaBLAST Results
The last stage of the pipeline is to parse, with another script, the output produced by the searched sequences. In this step, each megablast file is read, and values of parameters such as each sequence’s name, identity and overlapping length are stored and printed to the screen as needed.
In my project I use a parser named Inslink, found in the HUSAR package, which returns the fields mentioned above to me as arrays. The only thing this parser does is read the file and store the values of the requested fields.
I then display these stored values by extending the code, and with a few extra lines I show the meaningful results I need.
Blog
Evaluating the Quality Line - Quality Filter
To improve the contaminating-organism analysis pipeline further and obtain more meaningful results, we decided to add a quality filter to the first steps (while the FASTQ file is still being processed). By filtering out reads below a certain threshold at that early stage, we will obtain more reliable results.
We will do this quality control by interpreting the 4th line of each read in the FASTQ file. This 4th line (actually the read’s sequencing quality score) is written (encoded) in different ways by different sequencing machines, and the filter has to be applied after recovering the quality score from this encoding.
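As a sketch of that decoding (assuming the common Phred+33 encoding; the quality string below is made up):

```shell
# A made-up FASTQ quality line (4th line of a read), Phred+33 encoded
qual='II?+5'

# Each character's ASCII code minus 33 is the quality score at that base
scores=$(printf '%s' "$qual" | od -An -tu1 | tr -s ' ' '\n' | grep -v '^$' \
    | awk '{print $1 - 33}' | paste -sd' ' -)
echo "$scores"   # 40 40 30 10 20
```

Reads whose scores fall below a chosen threshold can then be dropped before the rest of the pipeline runs.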
Blog
Labeling Parsed Results as “Ambiguous”
While parsing the results of my searches against various databases, besides evaluating them with various thresholds, I try to make them even more meaningful by labeling the hits above or below the chosen thresholds as either “Ambiguous” (uncertain, multi-valued) or “Unique”.
I label as “Ambiguous” the hits in a megablast file that satisfy the thresholds but involve more than one distinct organism. If every threshold-passing hit within a single file always belongs to the same organism, then I label it “Unique”.
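The labeling rule can be sketched in shell (file names and organisms below are made up): a read is “Unique” when all of its threshold-passing hits name one organism, otherwise “Ambiguous”.

```shell
# Toy hit lists: the organism column extracted from one megablast file per read
printf 'Homo sapiens\nHomo sapiens\n' > read_1.hits
printf 'Homo sapiens\nPan troglodytes\n' > read_2.hits

for f in read_1.hits read_2.hits; do
    # Count distinct organisms among the hits
    if [ "$(sort -u "$f" | wc -l | tr -d ' ')" -eq 1 ]; then
        echo "$f: Unique"
    else
        echo "$f: Ambiguous"
    fi
done
```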
Blog
Analysis Results for the Second Dataset
I have completed the analysis of the second dataset, which has fewer unmapped reads. Since it was a better sequencing sample than the previous one, the results I got were quite consistent. Analyzing this human genome sequence, I obtained the results below.
LIST OF ORGANISMS AND THEIR NUMBER OF OCCURRENCES
Ambiguous hit 1323
Homo sapiens 312
Pan troglodytes 25
Pongo abelii 18
Nomascus leucogenys 17
Halomonas sp. GFAJ-1 7
Callithrix jacchus 4
Macaca mulatta 3
Oryctolagus cuniculus 2
Loxodonta africana 1
Cavia porcellus 1
I will explain the term “Ambiguous hit” in another post.
Blog
Analyzing the New Dataset
Because the previous data I used for testing while designing the pipeline was so poor, I obtained a new dataset. Of course, using several datasets with different characteristics during testing is useful; but I can say the previous dataset was too poor to yield even a few meaningful results. You can see the details [here]({% post_url 2012-07-06-eslestirme-ve-eslesmeyen-okumalari %}).
The new dataset is again human genome data; its BAM file is 1.8 GB and contained both mapped and unmapped reads. Using the bam2fastq tool, I converted this BAM file to a FASTQ file while also filtering out the mapped reads, 0.
Blog
Running MegaBLAST from Multiple Sequence Files
I wrote the script below so the pipeline’s MegaBLAST search would fit a technique we devised to make it faster. What it does is search the given databases using the per-read, preformatted sequence files, with a specified starting point and number of reads.
#!/usr/local/bin/perl

$database = $ARGV[0];
$dir = $ARGV[1]; # directory for sequences
$sp = $ARGV[2];  # starting point
$n = $ARGV[3] + $sp;

while (1) {
    system("blastplus -programname=megablast $dir/read_$sp.seq $database -OUTFILE=read_$sp.megablast -nobatch -d");
    $sp++;
    last if ($sp == $n);
}
Everything here works through really simple programming.
Blog
Running MegaBLAST from a Single FASTA File - Regular Expressions
Below is the Perl script I wrote to run MegaBLAST by reading a FASTA file and to collect the results in a directory, together with its explanation. This script is an important part of the pipeline I am designing. It is the first one I wrote, and it reaches all the reads through a single FASTA file.
#!/usr/local/bin/perl
$database = $ARGV[0];
$fasta = $ARGV[1]; # input file
$sp = $ARGV[2];    # starting point
$n = $ARGV[3] + $sp;

if (!defined($n)) { $n = 12; } # set default number

open FASTA, $fasta or die $!
Blog
Reading a Command's Output with Perl on Unix, and Regular Expressions
I previously described how I extract organism names with regular expressions. Here I will cover something similar, but done in Perl with a slightly more specialized technique: a very useful method I needed because the database returns its information to me as output spanning several lines. You can certainly adapt it to other purposes as well.
The need arose because the honest database, built by the HUSAR group, does not present organism names directly but spreads them over several lines. You can see an example of this below.
Blog
Speeding Up the MegaBLAST Search
Lately I have been looking for the quickest and most effective way to run MegaBLAST against different databases, and at the FASTA-file-creation stage a really useful method came from my supervisor.
Previously I searched from a single FASTA file that contained all the sequences, and this wasted time. Even though the file is opened only once, seeking to and reading the right lines on every iteration is a slow operation. We solved this by turning each read in the file into a separate FASTA file.
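The split can be sketched with awk (toy reads; real input would hold thousands):

```shell
# A toy multi-FASTA file
printf '>read_1\nACGTACGT\n>read_2\nGGCCGGCC\n' > all_reads.fa

# Start a new output file at every '>' header: read_1.seq, read_2.seq, ...
awk '/^>/ { n++; out = "read_" n ".seq" } { print > out }' all_reads.fa

cat read_2.seq
```

Each `read_N.seq` then holds exactly one record, so a search over read N never touches the other reads.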
Blog
A Perl Script for Converting FASTQ to FASTA
FASTQ and FASTA are file formats that carry essentially the same information, except that one of them has two fewer lines of information per sequence. Another difference, important for my project, is that the FASTA format can be searched with MegaBLAST directly. That is why I need to convert the FASTQ format produced by sequencing machines into FASTA. And this script is the first step of the pipeline.
Actually, because the test sequencing data I received had not been mapped by those who delivered it to me, I had performed that mapping as a preliminary step.
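The conversion itself can be sketched in shell (the post's actual script is in Perl; this awk one-liner and the toy record are illustrative):

```shell
# One toy FASTQ record: @header, sequence, '+', quality line
printf '@read_1\nACGTACGT\n+\nIIIIIIII\n' > sample.fastq

# Keep lines 1 and 2 of each 4-line record; rewrite the @ header as a > header
awk 'NR % 4 == 1 { sub(/^@/, ">"); print } NR % 4 == 2 { print }' sample.fastq > sample.fasta

cat sample.fasta
```

The `+` separator and quality line are dropped, which is exactly the "two fewer lines per sequence" difference between the formats.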
Blog
First Step: Obtaining the Unmapped Reads
As I mentioned before, the first part of my project is extracting the unmapped reads (unmapped reads) from the FASTQ file. This way I remove the sequences I won’t need in later analyses and reduce the workload of those analyses.
Since my goal from the start has been to design a pipeline that carries out the whole project step by step, I will do this with a Perl script. It is the pipeline’s first script, where the raw FASTQ data coming from the laboratory is used as input. Strictly speaking I wouldn’t need this script; I added this step only because my data also contains mapped reads.
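In SAM/BAM data the unmapped state lives in the FLAG field (bit 0x4); with samtools one would use `samtools view -f 4`. A toy sketch of the idea (hand-made records whose flag is exactly 4):

```shell
# Two toy SAM-like records: field 2 is the FLAG; 4 means "read unmapped"
printf 'r1\t0\tchr1\t100\n' > toy.sam
printf 'r2\t4\t*\t0\n' >> toy.sam

# Keep only the names of unmapped reads
awk -F'\t' '$2 == 4 { print $1 }' toy.sam
```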
Blog
The FASTQ Format - FASTQ Files
Today I received the “test” sequencing data I will use while building the program. It consists of two FASTQ files, each compressed and yet still around 6 GB. Since I don’t want to lose too much time, I will of course use only a part of one of these files.
My goal is to find the mappable reads in these FASTQ files with the BWA tool and then remove them, and to save the remaining unmapped reads in a language MegaBLAST can understand (the FASTA format).
By the way, since I am doing the whole project on a Unix machine, I am learning many commands; I will try to write about them separately later.
Blog
Detecting Organisms That Contaminate Sequencing Studies
The first study I will undertake in my summer internship is slowly taking shape. In it, I will build a pipeline and use it to try to detect the organisms that contaminate sequencing (sequencing) samples in laboratories.
In laboratories, samples can be contaminated by other organisms or foreign DNA for many reasons. These can be bacteria or yeast, or even viral DNA. After you sequence a DNA sample, its alignment rate against the reference can come out very low; this indicates that foreign DNA may be present. Another cause may be that the reference DNA is simply different.
Tag: perl dbi
Blog
Install Perl DBI Module on Ubuntu 12.04
Tag: bluetooth
Blog
Start Ubuntu 12.04 Bluetooth Off
On Terminal:
sudo gedit /etc/rc.local
Add the following before the line “exit 0”
rfkill block bluetooth
Save
Source
Tag: steam
Blog
Install Steam on Ubuntu 12.04
Download steam_latest.deb at:
http://repo.steampowered.com/steam/archive/precise/steam_latest.deb
Double click to open it in Ubuntu Software Center and click Install
It’ll start a Terminal and ask for your sudo password because some packages are required; enter your password and continue
Next it’ll update itself
Done
Source
Tag: hibernation
Blog
Enable Hibernation for Lenovo Z500 on Ubuntu 12.04
Using Terminal add this file:
sudo gedit /etc/polkit-1/localauthority/50-local.d/com.ubuntu.enable-hibernate.pkla
This:
[Re-enable hibernate by default]
Identity=unix-user:*
Action=org.freedesktop.upower.hibernate
ResultActive=yes
Save & reboot
Source
Tag: lenovo
Blog
Enable Hibernation for Lenovo Z500 on Ubuntu 12.04
Blog
Hotkeys (special keys) Volume/Brightness Controls Don't Work After Suspend
What seems to solve this problem on Ubuntu 12.04 LTS (Lenovo Z500):
Open this file:
sudo gedit /etc/default/grub
Modify the line as this:
GRUB_CMDLINE_LINUX="noapic"
Close it and run the following:
sudo update-grub
Restart your computer
Source
Blog
Suspend Laptop When Lid Closed Ubuntu 12.04 LTS in Lenovo Z500
I guess this is a bug: although suspend is set in Power settings, the laptop doesn’t suspend when its lid is closed.
To solve it, I’ve found a workaround on the web. Here is how you implement it:
Create the folder if it’s not present:
sudo mkdir /etc/acpi/local
Set its permissions:
sudo chmod 755 /etc/acpi/local
Create the script:
sudo gedit /etc/acpi/local/lid.sh.post
Copy-paste the following:
#!/bin/bash
grep -q closed /proc/acpi/button/lid/*/state
if [ $?
Tag: spotify
Blog
Install Spotify on Ubuntu 12.04
Start Software Sources from Dash Home
Add the following in the Other Sources tab:
deb http://repository.spotify.com stable non-free
Close Software Sources
Add the Spotify repo key on Terminal:
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 94558F59
Install Spotify on Terminal:
sudo apt-get update && sudo apt-get install spotify-client
Find Spotify in Dash Home
Source
Tag: brightness
Blog
Save Brightness Settings Ubuntu 12.04 LTS
If your laptop starts with minimized or maximized brightness and you want a fixed default value instead, do the following:
Run a terminal and type this to get the maximum brightness:
cat /sys/class/backlight/acpi_video0/max_brightness
Now set the brightness as you want and run the following, which gives you the value of the current setting:
cat /sys/class/backlight/acpi_video0/brightness
Edit /etc/rc.local to apply that value as the default after each reboot / start:
sudo gedit /etc/rc.local
Add this line before exit 0:
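The excerpt cuts off before the line itself; assuming the value printed by the second cat above was, say, 7 (purely illustrative), the added rc.local line would typically look like this:

```
echo 7 > /sys/class/backlight/acpi_video0/brightness
```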
Tag: brightness controls
Blog
Hotkeys (special keys) Volume/Brightness Controls Don't Work After Suspend
Tag: volume controls
Blog
Hotkeys (special keys) Volume/Brightness Controls Don't Work After Suspend
Tag: laravel
Blog
session_start() Permission denied (13) Laravel 4
Solve it by running the following:
chmod -R 755 /path/to/your/laravel/directory
chmod -R o+w /path/to/your/laravel/directory
And/or maybe:
sudo chown -R www-data:user /path/to/your/laravel/directory
Blog
If clean URLs don't work in Laravel 4 on Ubuntu 12.04 LTS
Your .htaccess directives are correct and mod_rewrite is enabled, but you are still getting 404 Not Found errors…
You need to change AllowOverride None to AllowOverride All in /etc/apache2/sites-available/default.
Modified section in the file:
<Directory /home/user/www/>
    Options Indexes FollowSymLinks MultiViews
    AllowOverride All
    Order allow,deny
    allow from all
</Directory>
Blog
Permission Issues develop Laravel 4 on Ubuntu 12.04 LTS
Blog
Base URL for Your Laravel 4 Website
Blog
Remove public from URL Laravel 4
Tag: session_start
Blog
session_start() Permission denied (13) Laravel 4
Tag: executable
Blog
How To Make A File or Script Executable in Ubuntu
Start a terminal: CTRL + Alt + T can be used (or just go to Dash Home and type Terminal).
Run this command below:
sudo chmod +x /path/to/your/file
Source
Tag: suspend
Blog
Suspend Laptop When Lid Closed Ubuntu 12.04 LTS in Lenovo Z500
Tag: api
Blog
Install Ensembl API and BioPerl 1.2.3 on Your System
Tag: bashrc
Blog
Install Ensembl API and BioPerl 1.2.3 on Your System
Tag: bioperl
Blog
Install Ensembl API and BioPerl 1.2.3 on Your System
Tag: gedit
Blog
Install Ensembl API and BioPerl 1.2.3 on Your System
Tag: install bioperl
Blog
Install Ensembl API and BioPerl 1.2.3 on Your System
Tag: lib
Blog
Install Ensembl API and BioPerl 1.2.3 on Your System
Tag: clean urls
Blog
If clean URLs don't work in Laravel 4 on Ubuntu 12.04 LTS
Tag: file browser
Blog
A Nice File Browser for Geany 1.23 on Ubuntu 12.04 LTS
If you’re looking for a file browser for Geany, check out the TreeBrowser plugin on its page (see the page for screenshots).
To install and enable it, just run the following on Terminal:
sudo apt-get install geany-plugin-treebrowser
Then go to “Tools” -> “Plugin Manager” and check “TreeBrowser”
Source
Tag: 403 forbidden
Blog
Permission Issues develop Laravel 4 on Ubuntu 12.04 LTS
Tag: base url
Blog
Base URL for Your Laravel 4 Website
Tag: remove public
Blog
Remove public from URL Laravel 4
Tag: data-derived network
Blog
Last Submissions to the Challenge
Today, I submitted the in silico and experimental data network inference results on Synapse for the next leaderboard, this Wednesday.
For experimental part, I had to exclude edges with FGFR1 and FGFR3 because the data lacks phosphorylated forms of these proteins and networks must be constructed using only phosphoproteins in the data.
Since there was an update for in silico part, I had to modify the script and resubmit the results.
Blog
Network Visualization Using Cytoscape
Cytoscape is a nice tool for visualizing networks for better understanding and presentation. I used it for the in silico data network visualization and the result was really pretty. Now, I have networks constructed using experimental data from the HPN-DREAM Challenge.
In this post, I want to demonstrate how to visualize a network with scores. I’m using Cytoscape 2.8 on Ubuntu 12.
First, the network will be read from a SIF file, which is Cytoscape’s default format for networks.
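For reference, SIF is a plain-text format with one interaction per line, in the order source, relationship, target (the node and relationship names below are made up):

```
ProteinA activates ProteinB
ProteinA inhibits ProteinC
```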
Blog
Plotting Expression Curves for Experimental Data
Now that I can plot expression curves for in silico data, I have moved on to the experimental data, which is larger and more complex. This data is the result of RPPA experiments on different breast cancer cell lines, and it includes protein abundance measurements for about 45 phosphoproteins. These phosphoproteins are treated with different inhibitors and stimuli, and by comparing their expressions, I will try to infer relations between them.
Before moving on to the inference part, I want a script that can plot the graphs so that I can inspect particular results for specific cases.
Blog
In silico Network Inference Last Improvements and Visualization of Result in Cytoscape
I’m almost done with the analysis of the in silico data, although I still need to decide whether the inhibiting parent nodes in the network require further analysis. Previously, I couldn’t filter out duplicate edges that were scored differently. Now, with some improvements in the script, low-scoring duplicates are filtered out and there is a better final list of edges, ready to be visualized.
I also tried visualizing it on Cytoscape.
Blog
Plotting Expression Profiles Data Analysis for Network Inference
For the in silico data network inference, I decided to develop my own script because the existing tools have bugs and are not compatible with the data. At the same time, I will try to report the bugs and compatibility issues to the developers.
The in silico data has 660 experiment results covering 20 antibodies, 4 kinds of stimuli and 3 kinds of inhibitors. Antibodies are treated with a stimulus, say at t_0; in the case of inhibitors, say at t_i, antibodies are pre-incubated for some time (t_pre) and then treated with a stimulus.
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed DREAM Breast Cancer Challenge. I presented the challenge and also some ways that I have found to solve the first sub-challenge network inference. Tina, from BiGCaT, suggested starting with in silico data which is much simpler than breast cancer data. Later, I can use the methods I develop for in silico data in experimental data.
in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, with 2 different concentrations of each.
Tag: dream challenge
Blog
Last Submissions to the Challenge
Blog
Network Visualization Using Cytoscape
Blog
Plotting Expression Curves for Experimental Data
Blog
Experimental Data Optimization for Network Inference
As I mentioned in my previous post, the experimental data from the challenge has missing values that create problems during analyses. To solve this, the first thing I did was to optimize the data: detecting missing conditions, inserting NAs for the missing values, and sorting where necessary.
I wrote two functions in the script. The first one ranks the data and sorts it based on these ranks.
Blog
Working with Experimental Data from Network Inference Challenge
Having almost finished with the in silico data, I moved on to analyzing the experimental data with the same script. But since the characteristics of the data are somewhat different, before inferring the network I need to modify the script so it can read the experimental data files.
These differences include missing data values for some conditions. This makes the analyses difficult, because I have to estimate a value for them, and this will decrease the confidence scores of the affected edges.
Blog
In silico Network Inference Last Improvements and Visualization of Result in Cytoscape
Blog
Latest Progress on Network Inference and Edge Scoring
I have slightly improved the network inference part of the script by changing the way I compare the intervention (inhibitor plus stimulus) and no-intervention (stimulus only) data from the in silico part.
Now, I’m using a function (simp) from an R package called StreamMetabolism, which takes time points and data values and calculates the area under the curve by integration (Sefick, 2009). I do this integration for both conditions and then compare them.
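simp integrates with Simpson's rule; the same compare-the-areas idea can be sketched with the simpler trapezoidal rule (the time points and values below are made up):

```shell
# (time, value) pairs for one curve; sum trapezoid areas between consecutive points
printf '0 1\n5 3\n10 2\n' \
    | awk 'NR > 1 { area += ($1 - t) * ($2 + v) / 2 } { t = $1; v = $2 } END { print area }'
# 10 + 12.5 = 22.5
```

Computing this area once for the intervention curve and once for the no-intervention curve gives the two numbers to compare.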
Blog
Scoring Edges Network Inference HPN-DREAM Challenge
Yesterday, I managed to infer a network for part of the in silico data from the challenge. Since the challenge also asks for scores on the edges in the networks, I developed the script further and added a function for that.
The edgeScorer function takes a data object of averaged time points for each curve in the intervention/no-intervention sets and scores each edge for each set of conditions. First it finds the largest difference among the sets and stores it as maxDifference; then it stores each difference divided by maxDifference in another data object.
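The normalization step can be sketched in shell (edge names and raw differences are made up; the largest difference is found first, then every score is divided by it):

```shell
# Edge list: source, target, raw difference score
printf 'A\tB\t2\nA\tC\t8\nB\tC\t4\n' > edges.txt

# Pass 1: find the largest difference (maxDifference)
max=$(awk -F'\t' '$3 > m { m = $3 } END { print m }' edges.txt)

# Pass 2: divide every score by maxDifference so scores fall in (0, 1]
awk -F'\t' -v max="$max" '{ printf "%s\t%s\t%.2f\n", $1, $2, $3 / max }' edges.txt
```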
Blog
Determining Edges More Progress on Network Inference
Lately, I have been writing an R script to infer a network using the in silico data. The last version of the script read the MIDAS file and plotted expression profiles. I have modified it, and now it reads the MIDAS file, runs some analyses and prints causal relations to a file. This file is a SIF file, as required.
This dataset was generated with 20 antibodies, but only 3 of them are perturbed. Also, for one of them, the stimulus is missing.
Blog
Plotting Expression Profiles Data Analysis for Network Inference
Blog
Webinar on HPN-DREAM Breast Cancer Network Inference Challenge
The DREAM8 organizers are planning a webinar about the HPN-DREAM Breast Cancer Network Inference Challenge on July 19, 10:30 - 11:30 (PDT / UTC -7). The general setup of the challenge and demo submissions to the leaderboard will be discussed, and questions about the challenge will be taken during the webinar. The number of participants in the challenge has also been announced: 138.
Registration for the webinar is done using this form. There are a limited number of “seats”, but recordings will be published later.
Blog
Network Inference Challenge in silico Data
Blog
Playing around with CellNOptR Tool and MIDAS File
With CellNOptR, we will try to construct network models for the challenge. The tool needs two inputs. The first is a special data object called a CNOlist that stores the data as vectors and matrices. The second is a .SIF file containing a prior knowledge network, which can be obtained from pathway databases and analysis tools.
A CNOlist contains the following fields: namesSignals, namesCues, namesStimuli and namesInhibitors, which are vectors storing the names of the measurements.
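To make the shape of that object concrete, here is a rough Python stand-in for a CNOlist (a sketch only; the real object lives in R's CellNOptR, and the antibody and cue names below are made-up placeholders):

```python
# A minimal stand-in for CellNOptR's CNOlist: each field is a vector of
# measurement names. All names below are hypothetical placeholders.
cnolist = {
    "namesSignals":    ["AKT_pS473", "ERK_pT202"],  # measured readouts
    "namesStimuli":    ["EGF", "Insulin"],          # applied ligands
    "namesInhibitors": ["MEKi"],                    # applied inhibitors
}
# In CellNOptR, the cues are (to my understanding) the union of the
# stimuli and the inhibitors.
cnolist["namesCues"] = cnolist["namesStimuli"] + cnolist["namesInhibitors"]

print(cnolist["namesCues"])  # ['EGF', 'Insulin', 'MEKi']
```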
Blog
Progress on Network Inference Sub-Challenge
This sub-challenge has several requirements:
- Directed, causal edges in the models (32 models: 4 cell lines × 8 stimuli)
- Edges scored with confidence values normalized to the range 0-1
- Nodes are the phosphoproteins from the data
- A prior knowledge network (which can be constructed from pathway databases) may be used (and is actually a must for some network inference tools)

The first step was to look for existing tools.
Blog
Network Inference DREAM Breast Cancer Challenge
A causal edge is inferred from the change seen in a node after an intervention on another node. If the time courses overlap (with and without the intervention), there is no relation. Otherwise, we can draw an edge between those nodes, and depending on whether the level goes up or down, the edge is activating or inhibiting. These causal edges are context-specific, so data from different cell lines may yield different relations.
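The overlap test can be sketched as a toy function (a hypothetical Python illustration, not the challenge's actual scoring code): if the two time courses stay close, no edge is drawn; otherwise the sign of the average difference decides activating vs. inhibiting.

```python
def infer_edge(no_intervention, intervention, tol=0.1):
    """Compare a node's time course with and without an intervention on a parent.

    Returns None (no relation), 'activating', or 'inhibiting'.
    If inhibiting the parent lowers the child, the parent activates it,
    and vice versa. `tol` is an arbitrary overlap tolerance.
    """
    diffs = [a - b for a, b in zip(no_intervention, intervention)]
    mean_diff = sum(diffs) / len(diffs)
    if abs(mean_diff) < tol:        # curves overlap -> no causal edge
        return None
    return "activating" if mean_diff > 0 else "inhibiting"

print(infer_edge([1.0, 1.1, 1.0], [1.0, 1.05, 1.02]))  # overlapping curves -> None
print(infer_edge([1.0, 1.5, 2.0], [1.0, 1.1, 1.2]))    # child drops under intervention -> activating
```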
Blog
DREAM Breast Cancer Sub-challenges
I have been going over the sub-challenges before attempting to solve them. As I mentioned, there are three sub-challenges, and they are connected to some extent.
First, using the given data and other possible sources such as pathway databases, we infer the causal signaling network of the phosphoproteins. There are 4 cell lines and 8 stimuli, which makes 32 networks in total. Nodes are phosphoproteins, and edges should be directed and causal (activating or inhibiting).
Blog
HPN-DREAM Breast Cancer Network Inference Challenge
Understanding signaling networks might bring more insight into cancer treatment, because cells respond to their environment by activating these networks, and phosphorylation reactions play important roles in them.
The goal of this challenge is to advance our ability to infer signaling networks and to predict protein phosphorylation dynamics. We are also asked to develop a visualization method for the data.
The dataset provided is extensive and is the result of RPPA (reverse-phase protein array) experiments.
Blog
Dream Challenge
This year, the 8th DREAM Challenge takes place, and I will be working on it as my internship project at BiGCaT, Bioinformatics, UM. The challenge brings scientists together to “catalyze the interaction between experiment and theory in the area of cellular network inference and quantitative model building in systems biology” (as stated on their webpage).
In this competition, I will work on a specific challenge involving network modeling, dynamic response prediction and data visualization.
Tag: hpn-dream
Blog
Last Submissions to the Challenge
Today, I submitted the in silico and experimental network inference results on Synapse for the next leaderboard, this Wednesday.
For the experimental part, I had to exclude edges involving FGFR1 and FGFR3, because the data lacks the phosphorylated forms of these proteins and the networks must be constructed using only the phosphoproteins in the data.
Since there was an update for the in silico part, I had to modify the script and resubmit the results.
Blog
Network Visualization Using Cytoscape
Cytoscape is a nice tool for visualizing networks for better understanding and presentation. I used it to visualize the in silico network and the result was really pretty. Now, I have networks constructed from the experimental data of the HPN-DREAM Challenge.
In this post, I want to demonstrate how to visualize a network with edge scores. I’m using Cytoscape 2.8 on Ubuntu 12.
First, the network is read from a SIF file, which is Cytoscape’s default network format.
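SIF is a simple line-oriented format: each line is a source node, an interaction type, and one or more target nodes, separated by whitespace. A minimal reader can be sketched in Python (the node names are invented for illustration):

```python
def read_sif(lines):
    """Parse SIF lines into (source, interaction, target) triples.

    Each SIF line is: <source> <interaction> <target1> [<target2> ...];
    a line holding only a node name declares an isolated node and is
    skipped here for simplicity.
    """
    edges = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        source, interaction, targets = parts[0], parts[1], parts[2:]
        for target in targets:
            edges.append((source, interaction, target))
    return edges

sif = ["AKT activates mTOR", "MEK activates ERK RSK"]
print(read_sif(sif))
# [('AKT', 'activates', 'mTOR'), ('MEK', 'activates', 'ERK'), ('MEK', 'activates', 'RSK')]
```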
Blog
Plotting Expression Curves for Experimental Data
Now that I can plot expression curves for the in silico data, I have moved on to the experimental data, which is larger and more complex. This data is the result of RPPA experiments on different breast cancer cell lines and includes protein abundance measurements for about 45 phosphoproteins. These phosphoproteins are treated with different inhibitors and stimuli, and by comparing their expression profiles I will try to infer relations between them.
Before moving on to the inference part, I want a script that can plot these curves, so that I can inspect particular results for specific cases.
Blog
Experimental Data Optimization for Network Inference
As I mentioned in my previous post, the experimental data from the challenge has missing values that cause problems during analysis. The first thing I did to address this was to tidy the data: detecting missing conditions, inserting NAs for their data values, and sorting the rows where necessary.
I wrote two functions in the script. The first ranks the rows according to the desired ordering and sorts the data based on these ranks.
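The missing-condition step can be sketched like this (a Python stand-in for the original R code, with NaN playing the role of R's NA and the condition names invented):

```python
import math

def fill_missing(expected_conditions, measurements):
    """Ensure every expected condition has an entry, inserting NaN
    (the stand-in for R's NA) for missing ones, and return the rows
    sorted by condition name."""
    filled = {cond: measurements.get(cond, math.nan)
              for cond in expected_conditions}
    return dict(sorted(filled.items()))

data = {"EGF": 0.8, "Insulin": 1.2}            # the "Serum" condition was never measured
result = fill_missing(["Serum", "EGF", "Insulin"], data)
print(result)  # {'EGF': 0.8, 'Insulin': 1.2, 'Serum': nan}
```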
Blog
Working with Experimental Data from Network Inference Challenge
As I am almost finished with the in silico data, I have moved on to analyzing the experimental data with the same script. But since the characteristics of the data are somewhat different, before inferring networks I need to modify the script so it can read the experimental data files.
These differences include missing values for some conditions. This makes the analysis difficult, because I have to estimate values for them, which will lower the confidence scores of the affected edges.
Blog
In silico Network Inference Last Improvements and Visualization of Result in Cytoscape
I’m almost done with the analysis of the in silico data, although I still need to decide whether the inhibiting parent nodes in the network require further analysis. Previously, I couldn’t filter out duplicate edges that had been scored differently. Now, with some improvements to the script, the lower-scoring duplicates are filtered out, leaving a better final edge list that is ready to be visualized.
I also tried visualizing it in Cytoscape.
Blog
Latest Progress on Network Inference and Edge Scoring
I have slightly improved the network inference part of the script by changing how the intervention (inhibitor plus stimulus) and no-intervention (stimulus only) data from the in silico part are compared.
Now I’m using the simp function from an R package called StreamMetabolism, which takes time points and data values and calculates the area under the curve by numerical integration (Sefick, 2009). I do this integration for both conditions and then compare the results.
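The same comparison can be sketched in Python, with the trapezoidal rule standing in for simp's Simpson integration (the time points and values below are illustrative):

```python
def auc(times, values):
    """Area under a time course via the trapezoidal rule
    (handles unevenly spaced time points)."""
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        total += dt * (values[i] + values[i - 1]) / 2.0
    return total

# Compare the same readout with and without an inhibitor present.
t = [0, 5, 15, 30]                      # minutes (made-up time points)
no_intervention = [1.0, 2.0, 2.5, 2.0]
intervention    = [1.0, 1.2, 1.1, 1.0]

# A positive gap means the signal drops when the inhibitor is present.
print(auc(t, no_intervention) - auc(t, intervention))
```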
Blog
Scoring Edges Network Inference HPN-DREAM Challenge
Yesterday, I managed to infer a network for part of the in silico data from the challenge. Since the challenge also asks for the edges in the networks to be scored, I developed the script further and added a function for that.
The edgeScorer function takes a data object of time-point averages for each curve in the intervention/no-intervention sets and scores each edge for each set of conditions. First, it finds the largest difference among the sets and stores it as maxDifference; then it stores each difference divided by maxDifference in another data object.
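The normalization described above reduces to dividing every difference by the largest one, so the best-supported edge always scores 1. A Python sketch of the idea (edgeScorer itself is an R function in my script, and the edge names here are made up):

```python
def score_edges(differences):
    """Normalize absolute curve differences into confidence scores in [0, 1].

    `differences` maps (parent, child) pairs to the intervention vs.
    no-intervention difference; the largest magnitude becomes maxDifference.
    """
    max_difference = max(abs(d) for d in differences.values())
    return {edge: abs(d) / max_difference for edge, d in differences.items()}

diffs = {("MEK", "ERK"): 4.0, ("MEK", "AKT"): 1.0}
print(score_edges(diffs))  # {('MEK', 'ERK'): 1.0, ('MEK', 'AKT'): 0.25}
```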
Tag: in silico
Blog
Last Submissions to the Challenge
Today, I submitted the in silico and experimental network inference results on Synapse for the next leaderboard, this Wednesday.
For the experimental part, I had to exclude edges involving FGFR1 and FGFR3, because the data lacks the phosphorylated forms of these proteins and the networks must be constructed using only the phosphoproteins in the data.
Since there was an update for the in silico part, I had to modify the script and resubmit the results.
Blog
Working with Experimental Data from Network Inference Challenge
As I am almost finished with the in silico data, I have moved on to analyzing the experimental data with the same script. But since the characteristics of the data are somewhat different, before inferring networks I need to modify the script so it can read the experimental data files.
These differences include missing values for some conditions. This makes the analysis difficult, because I have to estimate values for them, which will lower the confidence scores of the affected edges.
Blog
In silico Network Inference Last Improvements and Visualization of Result in Cytoscape
I’m almost done with the analysis of the in silico data, although I still need to decide whether the inhibiting parent nodes in the network require further analysis. Previously, I couldn’t filter out duplicate edges that had been scored differently. Now, with some improvements to the script, the lower-scoring duplicates are filtered out, leaving a better final edge list that is ready to be visualized.
I also tried visualizing it in Cytoscape.
Blog
Scoring Edges Network Inference HPN-DREAM Challenge
Yesterday, I managed to infer a network for part of the in silico data from the challenge. Since the challenge also asks for the edges in the networks to be scored, I developed the script further and added a function for that.
The edgeScorer function takes a data object of time-point averages for each curve in the intervention/no-intervention sets and scores each edge for each set of conditions. First, it finds the largest difference among the sets and stores it as maxDifference; then it stores each difference divided by maxDifference in another data object.
Blog
Determining Edges More Progress on Network Inference
Lately, I have been writing an R script to infer a network from the in silico data. The last version of the script read the MIDAS file and plotted expression profiles. I have modified it, and now it reads the MIDAS file, performs some analyses and prints the causal relations to a file. This file is a SIF file, as required.
This dataset is generated with 20 antibodies, but only 3 of them are perturbed. Also, for one of them, the stimulus is missing.
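Printing the causal relations in SIF format is then one line per edge (a Python sketch; the actual script is in R, and the antibody names are placeholders):

```python
def write_sif(edges, path):
    """Write (source, relation, target) triples as SIF lines:
    one tab-separated 'source relation target' per line."""
    with open(path, "w") as f:
        for source, relation, target in edges:
            f.write(f"{source}\t{relation}\t{target}\n")

edges = [("AB1", "activates", "AB7"), ("AB3", "inhibits", "AB12")]
write_sif(edges, "network.sif")
print(open("network.sif").read())
```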
Blog
Plotting Expression Profiles Data Analysis for Network Inference
For the in silico network inference I decided to develop my own script, because the existing tools have bugs and are not compatible with the data. In the meantime, I will report the bugs and compatibility issues to the developers.
The in silico data comprises 660 experimental results covering 20 antibodies, 4 kinds of stimuli and 3 kinds of inhibitors. Antibodies are treated with a stimulus at, say, t_0; in the inhibitor experiments, the antibodies are first pre-incubated with the inhibitor for some time (t_pre) and then treated with a stimulus at t_i.
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge, along with some approaches I have found for the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop on the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, each at 2 different concentrations.
Blog
Progress on Network Inference Sub-Challenge
This sub-challenge has several requirements:
- Directed, causal edges in the models (32 models: 4 cell lines × 8 stimuli)
- Edges scored with confidence values normalized to the range 0-1
- Nodes are the phosphoproteins from the data
- A prior knowledge network (which can be constructed from pathway databases) may be used (and is actually a must for some network inference tools)

The first step was to look for existing tools.
Blog
HPN-DREAM Breast Cancer Network Inference Challenge
Understanding signaling networks might bring more insight into cancer treatment, because cells respond to their environment by activating these networks, and phosphorylation reactions play important roles in them.
The goal of this challenge is to advance our ability to infer signaling networks and to predict protein phosphorylation dynamics. We are also asked to develop a visualization method for the data.
The dataset provided is extensive and is the result of RPPA (reverse-phase protein array) experiments.
Tag: network inference
Blog
Last Submissions to the Challenge
Today, I submitted the in silico and experimental network inference results on Synapse for the next leaderboard, this Wednesday.
For the experimental part, I had to exclude edges involving FGFR1 and FGFR3, because the data lacks the phosphorylated forms of these proteins and the networks must be constructed using only the phosphoproteins in the data.
Since there was an update for the in silico part, I had to modify the script and resubmit the results.
Blog
Network Visualization Using Cytoscape
Cytoscape is a nice tool for visualizing networks for better understanding and presentation. I used it to visualize the in silico network and the result was really pretty. Now, I have networks constructed from the experimental data of the HPN-DREAM Challenge.
In this post, I want to demonstrate how to visualize a network with edge scores. I’m using Cytoscape 2.8 on Ubuntu 12.
First, the network is read from a SIF file, which is Cytoscape’s default network format.
Blog
Plotting Expression Curves for Experimental Data
Now that I can plot expression curves for the in silico data, I have moved on to the experimental data, which is larger and more complex. This data is the result of RPPA experiments on different breast cancer cell lines and includes protein abundance measurements for about 45 phosphoproteins. These phosphoproteins are treated with different inhibitors and stimuli, and by comparing their expression profiles I will try to infer relations between them.
Before moving on to the inference part, I want a script that can plot these curves, so that I can inspect particular results for specific cases.
Blog
Experimental Data Optimization for Network Inference
As I mentioned in my previous post, the experimental data from the challenge has missing values that cause problems during analysis. The first thing I did to address this was to tidy the data: detecting missing conditions, inserting NAs for their data values, and sorting the rows where necessary.
I wrote two functions in the script. The first ranks the rows according to the desired ordering and sorts the data based on these ranks.
Blog
Working with Experimental Data from Network Inference Challenge
As I am almost finished with the in silico data, I have moved on to analyzing the experimental data with the same script. But since the characteristics of the data are somewhat different, before inferring networks I need to modify the script so it can read the experimental data files.
These differences include missing values for some conditions. This makes the analysis difficult, because I have to estimate values for them, which will lower the confidence scores of the affected edges.
Blog
In silico Network Inference Last Improvements and Visualization of Result in Cytoscape
I’m almost done with the analysis of the in silico data, although I still need to decide whether the inhibiting parent nodes in the network require further analysis. Previously, I couldn’t filter out duplicate edges that had been scored differently. Now, with some improvements to the script, the lower-scoring duplicates are filtered out, leaving a better final edge list that is ready to be visualized.
I also tried visualizing it in Cytoscape.
Blog
Latest Progress on Network Inference and Edge Scoring
I have slightly improved the network inference part of the script by changing how the intervention (inhibitor plus stimulus) and no-intervention (stimulus only) data from the in silico part are compared.
Now I’m using the simp function from an R package called StreamMetabolism, which takes time points and data values and calculates the area under the curve by numerical integration (Sefick, 2009). I do this integration for both conditions and then compare the results.
Blog
Scoring Edges Network Inference HPN-DREAM Challenge
Yesterday, I managed to infer a network for part of the in silico data from the challenge. Since the challenge also asks for the edges in the networks to be scored, I developed the script further and added a function for that.
The edgeScorer function takes a data object of time-point averages for each curve in the intervention/no-intervention sets and scores each edge for each set of conditions. First, it finds the largest difference among the sets and stores it as maxDifference; then it stores each difference divided by maxDifference in another data object.
Blog
Determining Edges More Progress on Network Inference
Lately, I have been writing an R script to infer a network from the in silico data. The last version of the script read the MIDAS file and plotted expression profiles. I have modified it, and now it reads the MIDAS file, performs some analyses and prints the causal relations to a file. This file is a SIF file, as required.
This dataset is generated with 20 antibodies, but only 3 of them are perturbed. Also, for one of them, the stimulus is missing.
Blog
Plotting Expression Profiles Data Analysis for Network Inference
For the in silico network inference I decided to develop my own script, because the existing tools have bugs and are not compatible with the data. In the meantime, I will report the bugs and compatibility issues to the developers.
The in silico data comprises 660 experimental results covering 20 antibodies, 4 kinds of stimuli and 3 kinds of inhibitors. Antibodies are treated with a stimulus at, say, t_0; in the inhibitor experiments, the antibodies are first pre-incubated with the inhibitor for some time (t_pre) and then treated with a stimulus at t_i.
Blog
Webinar on HPN-DREAM Breast Cancer Network Inference Challenge
The DREAM8 organizers are planning a webinar on the HPN-DREAM Breast Cancer Network Inference Challenge for July 19, 10:30-11:30 (PDT / UTC-7). The general setup of the challenge and demo submissions to the leaderboard will be discussed, and questions about the challenge will be taken during the webinar. The number of participants in the challenge has also been announced: 138.
Registration for the webinar is done using this form. There are a limited number of “seats”, but recordings will be published later.
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge, along with some approaches I have found for the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop on the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, each at 2 different concentrations.
Blog
Playing around with CellNOptR Tool and MIDAS File
With CellNOptR, we will try to construct network models for the challenge. The tool needs two inputs. The first is a special data object called a CNOlist that stores the data as vectors and matrices. The second is a .SIF file containing a prior knowledge network, which can be obtained from pathway databases and analysis tools.
A CNOlist contains the following fields: namesSignals, namesCues, namesStimuli and namesInhibitors, which are vectors storing the names of the measurements.
Blog
Progress on Network Inference Sub-Challenge
This sub-challenge has several requirements:
- Directed, causal edges in the models (32 models: 4 cell lines × 8 stimuli)
- Edges scored with confidence values normalized to the range 0-1
- Nodes are the phosphoproteins from the data
- A prior knowledge network (which can be constructed from pathway databases) may be used (and is actually a must for some network inference tools)

The first step was to look for existing tools.
Blog
Network Inference DREAM Breast Cancer Challenge
A causal edge is inferred from the change seen in a node after an intervention on another node. If the time courses overlap (with and without the intervention), there is no relation. Otherwise, we can draw an edge between those nodes, and depending on whether the level goes up or down, the edge is activating or inhibiting. These causal edges are context-specific, so data from different cell lines may yield different relations.
Blog
DREAM Breast Cancer Sub-challenges
I have been going over the sub-challenges before attempting to solve them. As I mentioned, there are three sub-challenges, and they are connected to some extent.
First, using the given data and other possible sources such as pathway databases, we infer the causal signaling network of the phosphoproteins. There are 4 cell lines and 8 stimuli, which makes 32 networks in total. Nodes are phosphoproteins, and edges should be directed and causal (activating or inhibiting).
Blog
HPN-DREAM Breast Cancer Network Inference Challenge
Understanding signaling networks might bring more insight into cancer treatment, because cells respond to their environment by activating these networks, and phosphorylation reactions play important roles in them.
The goal of this challenge is to advance our ability to infer signaling networks and to predict protein phosphorylation dynamics. We are also asked to develop a visualization method for the data.
The dataset provided is extensive and is the result of RPPA (reverse-phase protein array) experiments.
Blog
Dream Challenge
This year, the 8th DREAM Challenge takes place, and I will be working on it as my internship project at BiGCaT, Bioinformatics, UM. The challenge brings scientists together to “catalyze the interaction between experiment and theory in the area of cellular network inference and quantitative model building in systems biology” (as stated on their webpage).
In this competition, I will work on a specific challenge involving network modeling, dynamic response prediction and data visualization.
Tag: import
Blog
Network Visualization Using Cytoscape
Cytoscape is a nice tool for visualizing networks for better understanding and presentation. I used it to visualize the in silico network and the result was really pretty. Now, I have networks constructed from the experimental data of the HPN-DREAM Challenge.
In this post, I want to demonstrate how to visualize a network with edge scores. I’m using Cytoscape 2.8 on Ubuntu 12.
First, the network is read from a SIF file, which is Cytoscape’s default network format.
Tag: mapper
Blog
Network Visualization Using Cytoscape
Cytoscape is a nice tool for visualizing networks for better understanding and presentation. I used it to visualize the in silico network and the result was really pretty. Now, I have networks constructed from the experimental data of the HPN-DREAM Challenge.
In this post, I want to demonstrate how to visualize a network with edge scores. I’m using Cytoscape 2.8 on Ubuntu 12.
First, the network is read from a SIF file, which is Cytoscape’s default network format.
Blog
BWA (Burrows-Wheeler Aligner): Aligner and Mapper
As I mentioned in my previous post, I will use an aligner (or mapper) to find out to what extent my data aligns with the reference genome. Afterwards, I will run some analyses on the part that does not align.
BWA (Burrows-Wheeler Aligner) is a program that aligns relatively short sequences against long reference genomes such as the human genome. The bwa-short algorithm is used for reads up to 200 bp (bp: base pairs), and the BWA-SW algorithm for reads between 200 bp and 100 kbp.
Many factors play a role in choosing an aligner/mapper; there are many such tools, each with different features.
Tag: sif
Blog
Network Visualization Using Cytoscape
Cytoscape is a nice tool for visualizing networks for better understanding and presentation. I used it to visualize the in silico network and the result was really pretty. Now, I have networks constructed from the experimental data of the HPN-DREAM Challenge.
In this post, I want to demonstrate how to visualize a network with edge scores. I’m using Cytoscape 2.8 on Ubuntu 12.
First, the network is read from a SIF file, which is Cytoscape’s default network format.
Blog
Determining Edges More Progress on Network Inference
Lately, I have been writing an R script to infer a network from the in silico data. The last version of the script read the MIDAS file and plotted expression profiles. I have modified it, and now it reads the MIDAS file, performs some analyses and prints the causal relations to a file. This file is a SIF file, as required.
This dataset is generated with 20 antibodies, but only 3 of them are perturbed. Also, for one of them, the stimulus is missing.
Blog
Playing around with CellNOptR Tool and MIDAS File
With CellNOptR, we will try to construct network models for the challenge. The tool needs two inputs. The first is a special data object called a CNOlist that stores the data as vectors and matrices. The second is a .SIF file containing a prior knowledge network, which can be obtained from pathway databases and analysis tools.
A CNOlist contains the following fields: namesSignals, namesCues, namesStimuli and namesInhibitors, which are vectors storing the names of the measurements.
Blog
Progress on Network Inference Sub-Challenge
This sub-challenge has several requirements:
- Directed, causal edges in the models (32 models: 4 cell lines × 8 stimuli)
- Edges scored with confidence values normalized to the range 0-1
- Nodes are the phosphoproteins from the data
- A prior knowledge network (which can be constructed from pathway databases) may be used (and is actually a must for some network inference tools)

The first step was to look for existing tools.
Blog
Network Inference DREAM Breast Cancer Challenge
A causal edge is inferred from the change seen in a node after an intervention on another node. If the time courses overlap (with and without the intervention), there is no relation. Otherwise, we can draw an edge between those nodes, and depending on whether the level goes up or down, the edge is activating or inhibiting. These causal edges are context-specific, so data from different cell lines may yield different relations.
Tag: vizmapper
Blog
Network Visualization Using Cytoscape
Cytoscape is a nice tool for visualizing networks for better understanding and presentation. I used it to visualize the in silico network and the result was really pretty. Now, I have networks constructed from the experimental data of the HPN-DREAM Challenge.
In this post, I want to demonstrate how to visualize a network with edge scores. I’m using Cytoscape 2.8 on Ubuntu 12.
First, the network is read from a SIF file, which is Cytoscape’s default network format.
Tag: experimental data
Blog
Plotting Expression Curves for Experimental Data
Now that I can plot expression curves for the in silico data, I have moved on to the experimental data, which is larger and more complex. This data is the result of RPPA experiments on different breast cancer cell lines and includes protein abundance measurements for about 45 phosphoproteins. These phosphoproteins are treated with different inhibitors and stimuli, and by comparing their expression profiles I will try to infer relations between them.
Before moving on to the inference part, I want a script that can plot these curves, so that I can inspect particular results for specific cases.
Tag: expression
Blog
Plotting Expression Curves for Experimental Data
Now that I can plot expression curves for the in silico data, I have moved on to the experimental data, which is larger and more complex. This data is the result of RPPA experiments on different breast cancer cell lines and includes protein abundance measurements for about 45 phosphoproteins. These phosphoproteins are treated with different inhibitors and stimuli, and by comparing their expression profiles I will try to infer relations between them.
Before moving on to the inference part, I want a script that can plot these curves, so that I can inspect particular results for specific cases.
Tag: plot
Blog
Plotting Expression Curves for Experimental Data
Now that I can plot expression curves for the in silico data, I have moved on to the experimental data, which is larger and more complex. This data is the result of RPPA experiments on different breast cancer cell lines and includes protein abundance measurements for about 45 phosphoproteins. These phosphoproteins are treated with different inhibitors and stimuli, and by comparing their expression profiles I will try to infer relations between them.
Before moving on to the inference part, I want a script that can plot these curves, so that I can inspect particular results for specific cases.
Tag: rppa
Blog
Plotting Expression Curves for Experimental Data
Now that I can plot expression curves for the in silico data, I have moved on to the experimental data, which is larger and more complex. This data is the result of RPPA experiments on different breast cancer cell lines and includes protein abundance measurements for about 45 phosphoproteins. These phosphoproteins are treated with different inhibitors and stimuli, and by comparing their expression profiles I will try to infer relations between them.
Before moving on to the inference part, I want a script that can plot these curves, so that I can inspect particular results for specific cases.
Blog
Network Inference DREAM Breast Cancer Challenge
A causal edge is inferred from the change seen in a node after an intervention on another node. If the time courses overlap (with and without the intervention), there is no relation. Otherwise, we can draw an edge between those nodes, and depending on whether the level goes up or down, the edge is activating or inhibiting. These causal edges are context-specific, so data from different cell lines may yield different relations.
Blog
HPN-DREAM Breast Cancer Network Inference Challenge
Understanding signaling networks might bring more insight into cancer treatment, because cells respond to their environment by activating these networks, and phosphorylation reactions play important roles in them.
The goal of this challenge is to advance our ability to infer signaling networks and to predict protein phosphorylation dynamics. We are also asked to develop a visualization method for the data.
The dataset provided is extensive and is the result of RPPA (reverse-phase protein array) experiments.
Tag: breast cancer
Blog
Experimental Data Optimization for Network Inference
As I mentioned in my previous post, the experimental data from the challenge has missing values that cause problems during analysis. The first thing I did to address this was to tidy the data: detecting missing conditions, inserting NAs for their data values, and sorting the rows where necessary.
I wrote two functions in the script. The first ranks the rows according to the desired ordering and sorts the data based on these ranks.
Blog
Working with Experimental Data from Network Inference Challenge
As I am almost finished with the in silico data, I have moved on to analyzing the experimental data with the same script. But since the characteristics of the data are somewhat different, before inferring networks I need to modify the script so it can read the experimental data files.
These differences include missing values for some conditions. This makes the analysis difficult, because I have to estimate values for them, which will lower the confidence scores of the affected edges.
Blog
Webinar on HPN-DREAM Breast Cancer Network Inference Challenge
The DREAM8 organizers are planning a webinar on the HPN-DREAM Breast Cancer Network Inference Challenge for July 19, 10:30-11:30 (PDT / UTC-7). The general setup of the challenge and demo submissions to the leaderboard will be discussed, and questions about the challenge will be taken during the webinar. The number of participants in the challenge has also been announced: 138.
Registration for the webinar is done using this form. There are a limited number of “seats”, but recordings will be published later.
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge, along with some approaches I have found for the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop on the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, each at 2 different concentrations.
Blog
DREAM Breast Cancer Sub-challenges
I have been going over the sub-challenges before attempting to solve them. As I mentioned, there are three sub-challenges, and they are interconnected.
First, using the given data and other possible data sources such as pathway databases, we infer the causal signaling networks of the phosphoproteins. There are 4 cell lines and 8 stimuli, making 32 networks in total. Nodes are phosphoproteins, and edges should be directed and causal (activating or inhibiting).
Blog
HPN-DREAM Breast Cancer Network Inference Challenge
Understanding signaling networks may bring more insight into cancer treatment, because cells respond to their environment by activating these networks, and phosphorylation reactions play important roles in them.
The goal of this challenge is to advance our ability to infer signaling networks and to predict protein phosphorylation dynamics. We are also asked to develop a visualization method for the data.
The dataset provided is extensive and is the result of RPPA (reverse-phase protein array) experiments.
Tag: bt20
Blog
Experimental Data Optimization for Network Inference
As I mentioned in my previous post, the experimental data from the challenge has missing values that create problems during analysis. To solve this, the first thing I did was to tidy the data: detecting missing conditions, inserting NAs for their data values, and sorting the data where necessary.
I wrote two functions in the script. The first ranks the data in the desired fashion and sorts it based on these ranks.
Tag: bt549
Blog
Experimental Data Optimization for Network Inference
As I mentioned in my previous post, the experimental data from the challenge has missing values that create problems during analysis. To solve this, the first thing I did was to tidy the data: detecting missing conditions, inserting NAs for their data values, and sorting the data where necessary.
I wrote two functions in the script. The first ranks the data in the desired fashion and sorts it based on these ranks.
Tag: cell line
Blog
Experimental Data Optimization for Network Inference
As I mentioned in my previous post, the experimental data from the challenge has missing values that create problems during analysis. To solve this, the first thing I did was to tidy the data: detecting missing conditions, inserting NAs for their data values, and sorting the data where necessary.
I wrote two functions in the script. The first ranks the data in the desired fashion and sorts it based on these ranks.
Tag: mcf7
Blog
Experimental Data Optimization for Network Inference
As I mentioned in my previous post, the experimental data from the challenge has missing values that create problems during analysis. To solve this, the first thing I did was to tidy the data: detecting missing conditions, inserting NAs for their data values, and sorting the data where necessary.
I wrote two functions in the script. The first ranks the data in the desired fashion and sorts it based on these ranks.
Tag: regression line
Blog
Experimental Data Optimization for Network Inference
As I mentioned in my previous post, the experimental data from the challenge has missing values that create problems during analysis. To solve this, the first thing I did was to tidy the data: detecting missing conditions, inserting NAs for their data values, and sorting the data where necessary.
I wrote two functions in the script. The first ranks the data in the desired fashion and sorts it based on these ranks.
Tag: uacc812
Blog
Experimental Data Optimization for Network Inference
As I mentioned in my previous post, the experimental data from the challenge has missing values that create problems during analysis. To solve this, the first thing I did was to tidy the data: detecting missing conditions, inserting NAs for their data values, and sorting the data where necessary.
I wrote two functions in the script. The first ranks the data in the desired fashion and sorts it based on these ranks.
Tag: check
Blog
Some String Functions in R, String Manipulation in R
I have programmed in Perl, Python, and PHP before, and string manipulation in them was more direct and easier than in R. Still, R has useful functions for string manipulation. I’m not an expert in R, but I’ve been working with it for a while and have learned some good functions for this purpose.
Concatenate strings
Concatenation is done with the paste function. It takes the strings to concatenate as arguments, separated by commas, along with the separator character(s).
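For example (expected results shown in comments):

```r
# sep is placed between the pieces of each element-wise join
paste("p", "Akt", sep = "-")              # "p-Akt"

# paste0 is paste with sep = ""; vectors are joined element-wise
paste0("t", c(0, 30, 60))                 # "t0" "t30" "t60"

# collapse joins a whole vector into one string
paste(c("a", "b", "c"), collapse = ", ")  # "a, b, c"
```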
Tag: concatenate
Blog
Some String Functions in R, String Manipulation in R
I have programmed in Perl, Python, and PHP before, and string manipulation in them was more direct and easier than in R. Still, R has useful functions for string manipulation. I’m not an expert in R, but I’ve been working with it for a while and have learned some good functions for this purpose.
Concatenate strings
Concatenation is done with the paste function. It takes the strings to concatenate as arguments, separated by commas, along with the separator character(s).
Tag: extract
Blog
Some String Functions in R, String Manipulation in R
I have programmed in Perl, Python, and PHP before, and string manipulation in them was more direct and easier than in R. Still, R has useful functions for string manipulation. I’m not an expert in R, but I’ve been working with it for a while and have learned some good functions for this purpose.
Concatenate strings
Concatenation is done with the paste function. It takes the strings to concatenate as arguments, separated by commas, along with the separator character(s).
Tag: r functions
Blog
Some String Functions in R, String Manipulation in R
I have programmed in Perl, Python, and PHP before, and string manipulation in them was more direct and easier than in R. Still, R has useful functions for string manipulation. I’m not an expert in R, but I’ve been working with it for a while and have learned some good functions for this purpose.
Concatenate strings
Concatenation is done with the paste function. It takes the strings to concatenate as arguments, separated by commas, along with the separator character(s).
Tag: regex
Blog
Some String Functions in R, String Manipulation in R
I have programmed in Perl, Python, and PHP before, and string manipulation in them was more direct and easier than in R. Still, R has useful functions for string manipulation. I’m not an expert in R, but I’ve been working with it for a while and have learned some good functions for this purpose.
Concatenate strings
Concatenation is done with the paste function. It takes the strings to concatenate as arguments, separated by commas, along with the separator character(s).
Tag: replace
Blog
Some String Functions in R, String Manipulation in R
I have programmed in Perl, Python, and PHP before, and string manipulation in them was more direct and easier than in R. Still, R has useful functions for string manipulation. I’m not an expert in R, but I’ve been working with it for a while and have learned some good functions for this purpose.
Concatenate strings
Concatenation is done with the paste function. It takes the strings to concatenate as arguments, separated by commas, along with the separator character(s).
Tag: split
Blog
Some String Functions in R, String Manipulation in R
I have programmed in Perl, Python, and PHP before, and string manipulation in them was more direct and easier than in R. Still, R has useful functions for string manipulation. I’m not an expert in R, but I’ve been working with it for a while and have learned some good functions for this purpose.
Concatenate strings
Concatenation is done with the paste function. It takes the strings to concatenate as arguments, separated by commas, along with the separator character(s).
Tag: string functions
Blog
Some String Functions in R, String Manipulation in R
I have programmed in Perl, Python, and PHP before, and string manipulation in them was more direct and easier than in R. Still, R has useful functions for string manipulation. I’m not an expert in R, but I’ve been working with it for a while and have learned some good functions for this purpose.
Concatenate strings
Concatenation is done with the paste function. It takes the strings to concatenate as arguments, separated by commas, along with the separator character(s).
Tag: area under curve
Blog
Latest Progress on Network Inference and Edge Scoring
I have slightly improved the network inference part of the script by changing the way the intervention (inhibitor plus stimulus) and no-intervention (stimulus only) data from the in silico part are compared.
Now I’m using a function (simp) from an R package called StreamMetabolism, which takes time points and data values and calculates the area under the curve by integration (Sefick, 2009). I do this integration for both conditions and then compare them.
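The comparison can be sketched with a base-R trapezoidal rule standing in for StreamMetabolism’s simp (which uses Simpson integration); the time points and data values below are made up:

```r
# Trapezoidal area under the curve for values y measured at times t
auc <- function(t, y) sum(diff(t) * (head(y, -1) + tail(y, -1)) / 2)

t      <- c(0, 5, 15, 30, 60)         # time points (made up)
no_int <- c(1.0, 2.0, 2.6, 2.2, 1.5)  # stimulus only
int    <- c(1.0, 1.2, 1.3, 1.1, 0.9)  # stimulus + inhibitor

# A large gap between the two areas suggests the inhibited node
# influences the measured one
auc(t, no_int) - auc(t, int)
```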
Tag: causal edge
Blog
Latest Progress on Network Inference and Edge Scoring
I have slightly improved the network inference part of the script by changing the way the intervention (inhibitor plus stimulus) and no-intervention (stimulus only) data from the in silico part are compared.
Now I’m using a function (simp) from an R package called StreamMetabolism, which takes time points and data values and calculates the area under the curve by integration (Sefick, 2009). I do this integration for both conditions and then compare them.
Blog
Scoring Edges Network Inference HPN-DREAM Challenge
Yesterday, I managed to infer a network for part of the in silico data from the challenge. Since the challenge also asks for scores on the edges of the networks, I developed the script further and added a function for that.
The edgeScorer function takes a data object of averaged time points for each curve in the intervention/no-intervention sets and scores each edge for each set of conditions. First, it finds the largest difference among the sets and sets it as maxDifference; then it stores the differences divided by maxDifference in another data object.
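The normalization step might look like this (a minimal sketch with made-up per-edge differences, not the actual edgeScorer code):

```r
# Differences between intervention and no-intervention measurements, per edge
diffs <- c(AKT = 4.2, ERK = -1.1, MEK = 2.0)

maxDifference <- max(abs(diffs))      # largest difference among the sets
scores <- abs(diffs) / maxDifference  # confidence scores, all in [0, 1]
```

Dividing by the largest absolute difference guarantees the strongest edge gets score 1 and every other edge falls between 0 and 1, as the challenge requires.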
Blog
Determining Edges More Progress on Network Inference
Lately, I have been writing an R script to infer a network from the in silico data. The last version of the script read the MIDAS file and plotted expression profiles. I have modified it, and now it reads the MIDAS file, performs some analyses and prints causal relations to a file. This file is a SIF file, as required.
This dataset was generated with 20 antibodies, but only 3 of them are perturbed. Also, for one of them, the stimulus is missing.
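SIF is a simple line-based format, one “source relation target” triple per line, so printing the relations only takes a paste and writeLines. The node names and the 1/-1 sign convention below are just for illustration:

```r
edges <- data.frame(
  from     = c("AKT", "MEK"),
  relation = c("1", "-1"),   # e.g. 1 = activating, -1 = inhibiting
  to       = c("GSK3", "ERK")
)

sif <- tempfile(fileext = ".sif")
writeLines(paste(edges$from, edges$relation, edges$to), sif)
# file contents: "AKT 1 GSK3" and "MEK -1 ERK"
```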
Blog
Plotting Expression Profiles Data Analysis for Network Inference
For network inference on the in silico data, I decided to develop my own script, because the existing tools have bugs and are not compatible with the data. At the same time, I will try to report the bugs and compatibility issues to the developers.
The in silico data has 660 experiment results covering 20 antibodies, 4 kinds of stimuli and 3 kinds of inhibitors. Antibodies are treated with a stimulus, say at t_0; in the case of inhibitors, say at t_i, antibodies are pre-incubated for some time (t_pre) and then treated with a stimulus.
Tag: integration
Blog
Latest Progress on Network Inference and Edge Scoring
I have slightly improved the network inference part of the script by changing the way the intervention (inhibitor plus stimulus) and no-intervention (stimulus only) data from the in silico part are compared.
Now I’m using a function (simp) from an R package called StreamMetabolism, which takes time points and data values and calculates the area under the curve by integration (Sefick, 2009). I do this integration for both conditions and then compare them.
Tag: scoring edges
Blog
Latest Progress on Network Inference and Edge Scoring
I have slightly improved the network inference part of the script by changing the way the intervention (inhibitor plus stimulus) and no-intervention (stimulus only) data from the in silico part are compared.
Now I’m using a function (simp) from an R package called StreamMetabolism, which takes time points and data values and calculates the area under the curve by integration (Sefick, 2009). I do this integration for both conditions and then compare them.
Blog
Scoring Edges Network Inference HPN-DREAM Challenge
Yesterday, I managed to infer a network for part of the in silico data from the challenge. Since the challenge also asks for scores on the edges of the networks, I developed the script further and added a function for that.
The edgeScorer function takes a data object of averaged time points for each curve in the intervention/no-intervention sets and scores each edge for each set of conditions. First, it finds the largest difference among the sets and sets it as maxDifference; then it stores the differences divided by maxDifference in another data object.
Tag: antibody
Blog
Determining Edges More Progress on Network Inference
Lately, I have been writing an R script to infer a network from the in silico data. The last version of the script read the MIDAS file and plotted expression profiles. I have modified it, and now it reads the MIDAS file, performs some analyses and prints causal relations to a file. This file is a SIF file, as required.
This dataset was generated with 20 antibodies, but only 3 of them are perturbed. Also, for one of them, the stimulus is missing.
Tag: excitatory
Blog
Determining Edges More Progress on Network Inference
Lately, I have been writing an R script to infer a network from the in silico data. The last version of the script read the MIDAS file and plotted expression profiles. I have modified it, and now it reads the MIDAS file, performs some analyses and prints causal relations to a file. This file is a SIF file, as required.
This dataset was generated with 20 antibodies, but only 3 of them are perturbed. Also, for one of them, the stimulus is missing.
Tag: inhibitory
Blog
Determining Edges More Progress on Network Inference
Lately, I have been writing an R script to infer a network from the in silico data. The last version of the script read the MIDAS file and plotted expression profiles. I have modified it, and now it reads the MIDAS file, performs some analyses and prints causal relations to a file. This file is a SIF file, as required.
This dataset was generated with 20 antibodies, but only 3 of them are perturbed. Also, for one of them, the stimulus is missing.
Tag: midas
Blog
Determining Edges More Progress on Network Inference
Lately, I have been writing an R script to infer a network from the in silico data. The last version of the script read the MIDAS file and plotted expression profiles. I have modified it, and now it reads the MIDAS file, performs some analyses and prints causal relations to a file. This file is a SIF file, as required.
This dataset was generated with 20 antibodies, but only 3 of them are perturbed. Also, for one of them, the stimulus is missing.
Blog
Plotting Expression Profiles Data Analysis for Network Inference
For network inference on the in silico data, I decided to develop my own script, because the existing tools have bugs and are not compatible with the data. At the same time, I will try to report the bugs and compatibility issues to the developers.
The in silico data has 660 experiment results covering 20 antibodies, 4 kinds of stimuli and 3 kinds of inhibitors. Antibodies are treated with a stimulus, say at t_0; in the case of inhibitors, say at t_i, antibodies are pre-incubated for some time (t_pre) and then treated with a stimulus.
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge along with some approaches I have found for solving the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop for the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, with 2 different concentrations of each.
Blog
Playing around with CellNOptR Tool and MIDAS File
With CellNOptR, we will try to construct network models for the challenge. The tool needs two inputs. The first is a special data object called a CNOlist, which stores vectors and matrices of data. The second is a .SIF file containing a prior knowledge network, which can be obtained from pathway databases and analysis tools.
A CNOlist contains the following fields: namesSignals, namesCues, namesStimuli and namesInhibitors, which are vectors storing the names of the measurements.
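Conceptually, those fields can be pictured as a named list (a hand-rolled sketch with made-up names for illustration; CellNOptR actually builds the object from a MIDAS file):

```r
cnolist <- list(
  namesSignals    = c("AKT", "ERK"),   # measured phosphoproteins
  namesCues       = c("EGF", "MEKi"),  # all perturbations applied
  namesStimuli    = c("EGF"),          # cues that are stimuli
  namesInhibitors = c("MEKi")          # cues that are inhibitors
)

cnolist$namesCues  # access a field by name
```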
Tag: webinar
Blog
Webinar on HPN-DREAM Breast Cancer Network Inference Challenge
The DREAM8 organizers are planning a webinar about the HPN-DREAM Breast Cancer Network Inference Challenge on July 19, at 10:30 - 11:30 (PDT / UTC -7). The general setup of the challenge and demo submissions to the leaderboard will be discussed, and questions about the challenge will also be taken during the webinar. The number of participants in the challenge has also been announced: 138.
Registration for the webinar is done using this form. There are a limited number of “seats”, but recordings will be published later.
Tag: bigcat
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge along with some approaches I have found for solving the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop for the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, with 2 different concentrations of each.
Blog
First Impressions and Thoughts on Rosalind Project
Actually, I signed up for Rosalind.info 8 months ago, but I didn’t really play around with it. Last week, though, after learning about it in a BiGCaT science cafe, I was more interested than before, and I just started solving problems.
Each problem gives you a description of the context and of the problem itself, along with a sample input and output. Sometimes there are also hints about the solution. What I did was write a solution that works for the sample and, hopefully, for the problem.
Blog
Dream Challenge
This year, the 8th DREAM Challenge takes place, and I will be working on it as my internship project at BiGCaT, Bioinformatics, UM. The challenge brings scientists together to catalyze the interaction between experiment and theory in the area of cellular network inference and quantitative model building in systems biology (as stated on their webpage).
In this competition, I will work on a specific challenge about network modeling, dynamic response prediction and data visualization.
Tag: cellnoptr
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge along with some approaches I have found for solving the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop for the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, with 2 different concentrations of each.
Blog
Playing around with CellNOptR Tool and MIDAS File
With CellNOptR, we will try to construct network models for the challenge. The tool needs two inputs. The first is a special data object called a CNOlist, which stores vectors and matrices of data. The second is a .SIF file containing a prior knowledge network, which can be obtained from pathway databases and analysis tools.
A CNOlist contains the following fields: namesSignals, namesCues, namesStimuli and namesInhibitors, which are vectors storing the names of the measurements.
Blog
Progress on Network Inference Sub-Challenge
This sub-challenge has several requirements:
- Directed and causal edges on the models (32 models: 4 cell lines × 8 stimuli)
- Edges should be scored (normalized to the range 0 to 1) to show confidence
- Nodes will be the phosphoproteins from the data
- A prior knowledge network (which can be constructed using pathway databases) may be used (and is actually a must for some network inference tools)
The first thing was to look for existing tools.
Blog
Network Inference DREAM Breast Cancer Challenge
The inference of causal edges is described as the change in a node seen after an intervention on another node. If the curves obtained over time overlap (under intervention and no intervention), then there is no relation. Otherwise, we can draw an edge between those nodes, and depending on whether the level goes up or down, the edge will be activating or inhibiting. These causal edges are context-specific, so in data from different cell lines we may find different relations.
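A minimal sketch of that rule in R (the tolerance, sign convention and data values are all made up for illustration):

```r
# Compare a node's curve with and without intervention on another node.
# Overlapping curves -> no edge; otherwise the direction of the shift
# gives the sign (here: level drops under intervention -> activating).
classify_edge <- function(no_int, int, tol = 0.2) {
  d <- mean(no_int - int)           # average gap between the curves
  if (abs(d) < tol) return("none")  # curves overlap: no relation
  if (d > 0) "activating" else "inhibiting"
}

classify_edge(c(1.0, 2.0, 2.5), c(1.0, 1.1, 1.2))  # "activating"
classify_edge(c(1.0, 1.05, 1.1), c(1.0, 1.0, 1.1)) # "none"
```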
Tag: cnolist
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge along with some approaches I have found for solving the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop for the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, with 2 different concentrations of each.
Blog
Playing around with CellNOptR Tool and MIDAS File
With CellNOptR, we will try to construct network models for the challenge. The tool needs two inputs. The first is a special data object called a CNOlist, which stores vectors and matrices of data. The second is a .SIF file containing a prior knowledge network, which can be obtained from pathway databases and analysis tools.
A CNOlist contains the following fields: namesSignals, namesCues, namesStimuli and namesInhibitors, which are vectors storing the names of the measurements.
Tag: cnorfeeder
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge along with some approaches I have found for solving the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop for the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, with 2 different concentrations of each.
Blog
Progress on Network Inference Sub-Challenge
This sub-challenge has several requirements:
- Directed and causal edges on the models (32 models: 4 cell lines × 8 stimuli)
- Edges should be scored (normalized to the range 0 to 1) to show confidence
- Nodes will be the phosphoproteins from the data
- A prior knowledge network (which can be constructed using pathway databases) may be used (and is actually a must for some network inference tools)
The first thing was to look for existing tools.
Tag: ddn
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge along with some approaches I have found for solving the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop for the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, with 2 different concentrations of each.
Tag: ebi
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge along with some approaches I have found for solving the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop for the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, with 2 different concentrations of each.
Tag: european bioinformatics institute
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge along with some approaches I have found for solving the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop for the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, with 2 different concentrations of each.
Tag: makebtables
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge along with some approaches I have found for solving the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop for the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, with 2 different concentrations of each.
Tag: makecnolist
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge along with some approaches I have found for solving the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop for the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, with 2 different concentrations of each.
Blog
Playing around with CellNOptR Tool and MIDAS File
With CellNOptR, we will try to construct network models for the challenge. The tool needs two inputs. The first is a special data object called a CNOlist, which stores vectors and matrices of data. The second is a .SIF file containing a prior knowledge network, which can be obtained from pathway databases and analysis tools.
A CNOlist contains the following fields: namesSignals, namesCues, namesStimuli and namesInhibitors, which are vectors storing the names of the measurements.
Tag: model
Blog
Network Inference Challenge in silico Data
I had a meeting with BiGCaT this week and we discussed the DREAM Breast Cancer Challenge. I presented the challenge along with some approaches I have found for solving the first sub-challenge, network inference. Tina, from BiGCaT, suggested starting with the in silico data, which is much simpler than the breast cancer data. Later, I can apply the methods I develop for the in silico data to the experimental data.
The in silico data contains 20 antibodies, 3 inhibitors and 2 ligand stimuli, with 2 different concentrations of each.
Tag: bioinformatics
Blog
First Impressions and Thoughts on Rosalind Project
Actually, I signed up for Rosalind.info 8 months ago, but I didn’t really play around with it. Last week, though, after learning about it in a BiGCaT science cafe, I was more interested than before, and I just started solving problems.
Each problem gives you a description of the context and of the problem itself, along with a sample input and output. Sometimes there are also hints about the solution. What I did was write a solution that works for the sample and, hopefully, for the problem.
Blog
Using Online Tools for Teaching Bioinformatics
I attended one of the science cafe meetings of the BiGCaT group today, where we discussed the use of online tools for teaching bioinformatics.
Andra Waagmeester (a PhD student from BiGCaT) introduced the Rosalind Project as a teaching tool. The project mainly focuses on bioinformatics problems: the website poses various questions of the kind encountered in any bioinformatics research, and solving them helps you learn bioinformatics.
Blog
HPN-DREAM Breast Cancer Network Inference Challenge
Understanding signaling networks may bring more insight into cancer treatment, because cells respond to their environment by activating these networks, and phosphorylation reactions play important roles in them.
The goal of this challenge is to advance our ability to infer signaling networks and to predict protein phosphorylation dynamics. We are also asked to develop a visualization method for the data.
The dataset provided is extensive and is the result of RPPA (reverse-phase protein array) experiments.
Blog
Dream Challenge
This year, the 8th DREAM Challenge takes place, and I will be working on it as my internship project at BiGCaT, Bioinformatics, UM. The challenge brings scientists together to catalyze the interaction between experiment and theory in the area of cellular network inference and quantitative model building in systems biology (as stated on their webpage).
In this competition, I will work on a specific challenge about network modeling, dynamic response prediction and data visualization.
Blog
Biyoinformatik mi? Yoksa Biyoenformatik mi?
While looking for topics for my posts, I browse the internet as well as books. There are of course plenty of foreign-language resources, and they are sufficient, but when I looked at Turkish resources, the first thing that caught my eye was the different forms of this field’s name.
As you know, in English this field is called bioinformatics. That is quite natural, because in English informatics comes from the word information with the -ics suffix, and that word has a Latin origin. The word came into Turkish as enformatik, from the French informatique, and bilişim has also been proposed as a Turkish equivalent. This French word, of course, shares the same origin as its English counterpart.
Tag: bioinformatics stronghold
Blog
First Impressions and Thoughts on Rosalind Project
Actually, I signed up for Rosalind.info 8 months ago, but I didn’t really play around with it. Last week, though, after learning about it in a BiGCaT science cafe, I was more interested than before, and I just started solving problems.
Each problem gives you a description of the context and of the problem itself, along with a sample input and output. Sometimes there are also hints about the solution. What I did was write a solution that works for the sample and, hopefully, for the problem.
Tag: biology
Blog
First Impressions and Thoughts on Rosalind Project
Actually, I signed up for Rosalind.info 8 months ago, but I didn’t really play around with it. Last week, though, after learning about it in a BiGCaT science cafe, I was more interested than before, and I just started solving problems.
Each problem gives you a description of the context and of the problem itself, along with a sample input and output. Sometimes there are also hints about the solution. What I did was write a solution that works for the sample and, hopefully, for the problem.
Blog
Retrieving Data with AJAX using jQuery, PHP and MySQL
Last semester, I took a course from the Informatics Institute at METU called “Biological Databases and Data Analysis Tools”, where we first learned what a database is and how to run queries on one, along with the technology behind databases. Then we covered the many biological databases and data analysis tools available, including gene, protein and pathway databases, and tools for creating databases.
As a final project, we were asked to create an online tool that can search a database, retrieve the data, and display it in any web browser.
Tag: computer science
Blog
First Impressions and Thoughts on Rosalind Project
Actually, I signed up for Rosalind.info 8 months ago, but I didn’t really play around with it. Last week, though, after learning about it in a BiGCaT science cafe, I was more interested than before, and I just started solving problems.
Each problem gives you a description of the context and of the problem itself, along with a sample input and output. Sometimes there are also hints about the solution. What I did was write a solution that works for the sample and, hopefully, for the problem.
Tag: genetics
Blog
First Impressions and Thoughts on Rosalind Project
Tag: python village
Blog
First Impressions and Thoughts on Rosalind Project
Tag: rosalind
Blog
First Impressions and Thoughts on Rosalind Project
Blog
Using Online Tools for Teaching Bioinformatics
I attended one of the science cafe meetings of the BiGCaT group today, where we discussed the use of online tools for teaching bioinformatics.
Andra Waagmeester (a PhD student from BiGCaT) introduced the Rosalind Project as a teaching tool. The project focuses on bioinformatics problems: the website poses various questions of the kind encountered in real bioinformatics research, and solving them helps you learn bioinformatics.
Tag: eda
Blog
Progress on Network Inference Sub-Challenge
This sub-challenge has several requirements:
- Directed and causal edges in the models (32 models: 4 cell lines × 8 stimuli)
- Edges should be scored (normalized to the range 0-1) to express confidence
- Nodes will be the phosphoproteins from the data
- A prior knowledge network (which can be constructed from pathway databases) may be used (in fact, it is a must for some network inference tools)
The first thing was to look for existing tools.
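The scoring requirement above (confidence values normalized to the 0-1 range) can be met with a simple min-max rescaling. A minimal Python sketch; the edge names and raw scores are hypothetical, not from the challenge data:

```python
def normalize_scores(scores):
    """Min-max normalize raw edge scores into the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    if span == 0:  # all edges equally confident
        return {edge: 1.0 for edge in scores}
    return {edge: (s - lo) / span for edge, s in scores.items()}

# Hypothetical raw confidences for three directed edges
raw = {("EGFR", "AKT"): 1.1, ("AKT", "mTOR"): 3.2, ("mTOR", "S6K"): 2.0}
print(normalize_scores(raw))
```

The highest-confidence edge maps to 1, the lowest to 0, which matches the submission format's requirement of scores between 0 and 1.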
Blog
Network Inference DREAM Breast Cancer Challenge
The inference of a causal edge is described as the change seen in one node after an intervention on another node. If the curves obtained over time overlap (with and without the intervention), there is no relation. Otherwise, we can draw an edge between those nodes, and depending on whether the level goes up or down, the edge is activating or inhibiting. These causal edges are context-specific, so in different cell line data we may find different relations.
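The decision rule just described can be sketched as a toy function. In this Python illustration, the overlap test (a tolerance on the mean shift between the two curves) is my own simplification, not the challenge's actual scoring method:

```python
def causal_edge(control, intervention, tol=0.1):
    """Compare a node's time course without and with an intervention on
    another node. Overlapping curves -> no edge (0); otherwise the sign
    of the shift decides activating (+1) vs inhibiting (-1)."""
    diffs = [i - c for c, i in zip(control, intervention)]
    mean_shift = sum(diffs) / len(diffs)
    if abs(mean_shift) <= tol:            # curves overlap: no relation
        return 0
    return 1 if mean_shift > 0 else -1    # up -> activating, down -> inhibiting

print(causal_edge([1.0, 1.1, 1.0], [1.8, 2.0, 1.9]))  # prints 1
```

Because the comparison is done per cell line, the same node pair can yield different edges in different contexts, as the excerpt notes.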
Tag: kegg
Blog
Progress on Network Inference Sub-Challenge
Tag: microarray
Blog
Progress on Network Inference Sub-Challenge
Tag: pkn
Blog
Progress on Network Inference Sub-Challenge
Tag: wikipathways
Blog
Progress on Network Inference Sub-Challenge
Tag: ajax
Blog
Retrieving Data with AJAX using jQuery, PHP and MySQL
Tag: bind
Blog
Retrieving Data with AJAX using jQuery, PHP and MySQL
Tag: biological databases
Blog
Retrieving Data with AJAX using jQuery, PHP and MySQL
Tag: change
Blog
Retrieving Data with AJAX using jQuery, PHP and MySQL
Tag: jquery
Blog
Retrieving Data with AJAX using jQuery, PHP and MySQL
Tag: keypress
Blog
Retrieving Data with AJAX using jQuery, PHP and MySQL
Tag: pathway
Blog
Retrieving Data with AJAX using jQuery, PHP and MySQL
Tag: protein
Blog
Retrieving Data with AJAX using jQuery, PHP and MySQL
Tag: code academy
Blog
Using Online Tools for Teaching Bioinformatics
Tag: online tools
Blog
Using Online Tools for Teaching Bioinformatics
Tag: teaching
Blog
Using Online Tools for Teaching Bioinformatics
Tag: teaching bioinformatics
Blog
Using Online Tools for Teaching Bioinformatics
Tag: cran
Blog
Network Inference DREAM Breast Cancer Challenge
Tag: ddepn
Blog
Network Inference DREAM Breast Cancer Challenge
Tag: r package
Blog
Network Inference DREAM Breast Cancer Challenge
Tag: rppanalyzer
Blog
Network Inference DREAM Breast Cancer Challenge
Tag: dmso
Blog
DREAM Breast Cancer Sub-challenges
I have been going over the sub-challenges before attempting to solve them. As I mentioned, there are three sub-challenges, and they are connected.
First, using the given data and other possible data sources such as pathway databases, we infer the causal signaling network of the phosphoproteins. There are 4 cell lines and 8 stimuli, making 32 networks in total. Nodes are phosphoproteins, and edges should be directed and causal (activating or inhibiting).
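The bookkeeping for the 32 networks is just the cross product of cell lines and stimuli. A minimal Python sketch; the names below are placeholders, not the challenge's actual cell line or stimulus identifiers:

```python
from itertools import product

cell_lines = ["CL1", "CL2", "CL3", "CL4"]       # 4 placeholder cell lines
stimuli = [f"S{i}" for i in range(1, 9)]        # 8 placeholder stimuli

# One (directed, causal) edge set per (cell line, stimulus) context
networks = {(cl, st): set() for cl, st in product(cell_lines, stimuli)}
print(len(networks))  # prints 32
```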
Blog
HPN-DREAM Breast Cancer Network Inference Challenge
Understanding signaling networks might bring more insight into cancer treatment, because cells respond to their environment by activating these networks, and phosphorylation reactions play important roles in them.
The goal of this challenge is to advance our ability to infer signaling networks and to predict protein phosphorylation dynamics. We are also asked to develop a visualization method for the data.
The dataset provided is extensive and comes from RPPA (reverse-phase protein array) experiments.
Tag: time-course prediction
Blog
DREAM Breast Cancer Sub-challenges
Tag: lysates
Blog
HPN-DREAM Breast Cancer Network Inference Challenge
Tag: phosphorylation
Blog
HPN-DREAM Breast Cancer Network Inference Challenge
Tag: proteomics
Blog
HPN-DREAM Breast Cancer Network Inference Challenge
Tag: reverse-phase protein array
Blog
HPN-DREAM Breast Cancer Network Inference Challenge
Tag: sage
Blog
Dream Challenge
This year, the 8th DREAM Challenge takes place, and I will be working on it as my internship project at BiGCaT, Bioinformatics, UM. The challenge brings scientists together to "catalyze the interaction between experiment and theory in the area of cellular network inference and quantitative model building in systems biology" (as stated on their webpage).
In this competition, I will work on a specific challenge about network modeling, dynamic response prediction, and data visualization.
Tag: synapse
Blog
Dream Challenge
Tag: array
Blog
Performing Batch Searches in SRS
Since the latest version of the analysis script examines far more reads than the previous ones, looking up a name in SRS for every single read was a very time-consuming operation; the last run took 4 days.
To reduce this, I completely rewrote the analysis script. As always, it first takes the hits that pass the threshold, but now I collect their ID numbers directly in an array. Then I join each element of this list with the pipe character into a single string.
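The batching step just described (collect the IDs that pass the threshold, then join them with the pipe character into one query string) looks roughly like this. A Python sketch of the logic with made-up field names; the actual script is written in Perl:

```python
def build_batch_query(hits, threshold):
    """Collect the IDs of hits passing the threshold and join them with
    '|' so a single SRS lookup can replace one lookup per read."""
    ids = [h["id"] for h in hits if h["score"] >= threshold]
    return "|".join(ids)

# Hypothetical hits; only two pass the threshold
hits = [{"id": "AC_000033", "score": 115},
        {"id": "XY_12345", "score": 40},
        {"id": "NC_000913", "score": 200}]
print(build_batch_query(hits, threshold=100))  # prints AC_000033|NC_000913
```

One batched query over the joined string replaces thousands of individual lookups, which is where the multi-day runtime went.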
Tag: getz
Blog
Performing Batch Searches in SRS
Blog
Reading a Command's Output with Perl on Unix, and Regular Expressions
I previously described how I extracted organism names with regular expressions. Here I will describe something similar, but done in Perl with a slightly more specialized technique: a very useful method I needed because I receive the information from the database as output spanning multiple lines. You can surely adapt it for other purposes as well.
This need arose because the honest database, built by the HUSAR group, does not present organism names directly but shows them across several lines. You can see an example of this below.
Blog
Extracting Species Names with Regular Expressions
Since at the end of my project I will show the user the names of possible contaminating organisms (Latin species names), I need to obtain the organism name for every sequence using the accession numbers in the MegaBLAST results. I can do this with another system available on the HUSAR servers, called the Sequence Retrieval System (SRS).
To get the organism name from SRS, it is enough to type the "getz" command on the Unix command line together with the database name, the accession number, and the field I want to retrieve. Below you can find example code that does this job.
Tag: hash
Blog
Performing Batch Searches in SRS
Blog
Parsing MegaBLAST Results
The last step in the pipeline is to examine, with another script, the output produced for the searched sequences. In this step each megablast file is read, and values of parameters such as the name, identity, and overlapping length of the sequences are stored and printed to the screen as needed.
In my project I use a parser called Inslink, included in the HUSAR package, which returns the fields mentioned above to me as an array. All this parser does is read the file and store the values of the requested fields.
I then display these stored values by extending the code, and with a few additional lines I print the meaningful results I need.
Blog
Separating Parsing Results as "Ambiguous"
While examining the results of my searches against various databases, I evaluate them with several threshold values, and I try to make them even more meaningful by separating the hits above or below the chosen thresholds into "ambiguous" or "unique".
I label as "ambiguous" the hits in a megablast file that pass the thresholds but involve more than one distinct organism. If every threshold-passing hit within a single file always belongs to the same organism, I label it "unique".
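The labeling rule can be sketched as a small function. A Python illustration with hypothetical field names; the real scripts parse megablast files in Perl:

```python
def label_file(hits):
    """Label a file's threshold-passing hits 'unique' if they all point
    to one organism, otherwise 'ambiguous'."""
    organisms = {h["organism"] for h in hits}
    return "unique" if len(organisms) <= 1 else "ambiguous"

print(label_file([{"organism": "Mus musculus"},
                  {"organism": "Mus musculus"}]))        # prints unique
print(label_file([{"organism": "Mus musculus"},
                  {"organism": "Escherichia coli"}]))    # prints ambiguous
```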
Tag: megablast
Blog
Performing Batch Searches in SRS
Blog
Parsing MegaBLAST Results
Blog
Evaluating the Quality Line - Quality Filter
To further improve the pipeline that will perform the contaminant analysis and obtain more meaningful results, we decided to add a quality filter to the first steps (while the fastq file is still being processed). By filtering out reads below a certain threshold at that early stage, we will obtain more reliable results.
We will do this quality control by interpreting the 4th line of each read in the fastq file. This 4th line (actually the read's sequencing quality score) is written (encoded) in different ways by different sequencing machines, and the filtering has to be applied after decoding the quality score back from this encoding.
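Decoding the quality line back into numbers means subtracting the encoding offset from each character's ASCII value. A Python sketch assuming the common Phred+33 (Sanger/Illumina 1.8+) encoding; the threshold is an illustrative choice, not the pipeline's actual cutoff:

```python
def mean_quality(quality_line, offset=33):
    """Decode an ASCII-encoded quality string (Phred+offset) and return
    the mean Phred score of the read."""
    scores = [ord(ch) - offset for ch in quality_line]
    return sum(scores) / len(scores)

def passes_filter(quality_line, threshold=20, offset=33):
    """Keep the read only if its mean quality reaches the threshold."""
    return mean_quality(quality_line, offset) >= threshold

print(passes_filter("IIIIIIII"))  # 'I' is Phred 40 at offset 33 -> prints True
```

Older Illumina machines used a +64 offset, which is why the excerpt notes that the encoding must be identified before filtering; the `offset` parameter covers that case.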
Blog
Separating Parsing Results as "Ambiguous"
Blog
Running MegaBLAST from Multiple Sequence Files
I wrote the script below to fit a technique we devised to make the pipeline's MegaBLAST search faster. What it does is search the databases using the per-read, formatted sequence files, given a starting point and a number of reads.
#!/usr/local/bin/perl

$database = $ARGV[0];
$dir = $ARGV[1];  # directory for sequences
$sp = $ARGV[2];   # starting point
$n = $ARGV[3] + $sp;

while (1) {
    system("blastplus -programname=megablast $dir/read_$sp.seq $database -OUTFILE=read_$sp.megablast -nobatch -d");
    $sp++;
    last if ($sp == $n);
}
Everything here works through really simple programming.
Blog
Running MegaBLAST from a Single FASTA File - Regular Expressions
Below is the Perl script I wrote to run MegaBLAST by reading a FASTA file and to collect the results in a directory, along with its explanation. This script is an important part of the pipeline I am designing. It is the first script I wrote, and it reaches all the reads through a single FASTA file.
#!/usr/local/bin/perl

$database = $ARGV[0];
$fasta = $ARGV[1];  # input file
$sp = $ARGV[2];     # starting point
$n = $ARGV[3] + $sp;

if (!defined($n)) { $n = 12; }  # set default number

open FASTA, $fasta or die $!
Blog
Reading a Command's Output with Perl on Unix, and Regular Expressions
Blog
Extracting Species Names with Regular Expressions
Blog
A MegaBLAST Output's Contents - The RefSeq Database
Below, we see the details of one hit from the file I obtained by searching the test FASTA file against the refseq_genomic database.
>>>>refseq_genomic_complete3: AC_000033_0310 Continuation (311 of 1357) of AC_000033 from base 31000001 (AC_000033 Mus musculus strain mixed chromosome 11, alternate assembly Mm_Celera, whole genome shotgun sequence. 2/2012)

 Length = 110000
 Score = 115 bits (58), Expect = 4e-22
 Identities = 74/79 (93%), Gaps = 2/79 (2%)
 Strand = Plus / Minus

Query: 1     ctctctctgtct-tctctctctctctgtctctctctctttctctctcttctctctctctc 59
             |||||||||||| ||| ||||||||| ||||||||||| |||||||||||||||||||||
Sbjct: 89773 ctctctctgtctgtctttctctctctctctctctctctctctctctcttctctctctctc 89714

Query: 60    tttctctctgccctctctc 78
             ||||||||| |||||||||
Sbjct: 89713 tttctctct-ccctctctc 89696

In the details, it first gives header information about the hit, marked with the >>>> characters.
Blog
Speeding Up the MegaBLAST Search
Lately I have been looking for the quickest and most effective way to run MegaBLAST against different databases, and at the FASTA file creation stage, a really useful method came from my supervisor.
Previously I searched from a single FASTA file containing all the sequences, and this wasted time. Even though the file is opened only once, seeking to the right lines and reading them on every search is a time-consuming operation. We solved this by turning every read in the file into a separate FASTA file.
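The split just described (one FASTA file per read) can be sketched as follows. A Python illustration; the `read_<n>.seq` naming is borrowed from this pipeline's scripts, but the function itself is my own:

```python
import os

def split_fasta(fasta_path, out_dir):
    """Write each record of a multi-sequence FASTA file to its own
    read_<n>.seq file, so later searches open exactly one read."""
    os.makedirs(out_dir, exist_ok=True)
    n = 0
    out = None
    with open(fasta_path) as fasta:
        for line in fasta:
            if line.startswith(">"):   # a header starts a new record
                if out:
                    out.close()
                out = open(os.path.join(out_dir, f"read_{n}.seq"), "w")
                n += 1
            if out:
                out.write(line)
    if out:
        out.close()
    return n  # number of per-read files written
```

Each `read_<n>.seq` file can then be handed to the search command directly, instead of re-scanning the combined file for every read.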
Blog
A Command's Runtime per Database - CPU Runtime
When the files and databases you work with are large and you do not have enough computing power, the first thing to measure is how to obtain the result in the most effective way and in the shortest time.
In my project in particular, I am investigating this with different databases and different parameters.
For now I am trying four databases: nrnuc, ensembl_cdna, honest, and refseq_genomic. I will also do this for two different word sizes. The word size is the number of base pairs that MegaBLAST will match exactly while searching. So if I have a sequence of 151 base pairs and the word size is set to 50, the search will look for stretches starting anywhere within those 151 base pairs in which at least 50 consecutive bases match.
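The word-size idea can be made concrete: a read of length L contains L - w + 1 candidate words of length w, and only an exact match of one of them can seed an alignment. A toy Python illustration of that count:

```python
def words(seq, w):
    """All length-w substrings (candidate exact-match seeds) of a read."""
    return [seq[i:i + w] for i in range(len(seq) - w + 1)]

read_length, word_size = 151, 50
print(read_length - word_size + 1)  # prints 102 candidate seed positions
```

A larger word size means fewer, more specific seeds, which is why it trades sensitivity for speed, the trade-off being measured in this post.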
Blog
A FASTQ-to-FASTA Conversion Perl Script
FASTQ and FASTA are file formats containing the same information, except that one has two fewer lines per sequence. The other difference that matters for my project is that MegaBLAST searches can be run directly on the FASTA format. That is why I need to convert the FASTQ format produced by the sequencing machines into FASTA. And this script is the first step of the pipeline.
Actually, since the test sequencing data I received had not been aligned by the person who provided it, I had performed that alignment as a preliminary step.
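The conversion itself only needs to keep the first two lines of each 4-line FASTQ record and rewrite the `@` header as `>`. A Python sketch of the idea; the pipeline's actual converter is a Perl script:

```python
def fastq_to_fasta(fastq_lines):
    """Convert 4-line FASTQ records to 2-line FASTA records, dropping
    the '+' separator line and the quality line."""
    fasta = []
    for i in range(0, len(fastq_lines), 4):
        header, sequence = fastq_lines[i], fastq_lines[i + 1]
        fasta.append(">" + header.lstrip("@"))
        fasta.append(sequence)
    return fasta

record = ["@read1", "ACGTACGT", "+", "IIIIIIII"]
print(fastq_to_fasta(record))  # prints ['>read1', 'ACGTACGT']
```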
Blog
SAM Files - BAM Files - samtools
Actually, the pipeline I need to program will run its analyses directly on the unmapped reads. However, since I could not find such data, and the only data I have contains both mapped and unmapped reads, I first had to get rid of the mapped ones.
As I mentioned before, I do this with the bwa aligner (mapper). After a series of steps, bwa produces a SAM file, but I need a FASTQ file. For this, I will convert the SAM file with samtools into the similar BAM format, and then obtain my FASTQ file with the bam2fastq tool.
Blog
MegaBLAST - A Tool for Finding Similarities Between Sequences
MegaBLAST is part of the BLAST (Basic Local Alignment Search Tool) package included in the HUSAR suite. It is also a variant of BLASTN. MegaBLAST handles long sequences more efficiently than BLASTN and runs much faster, though it is less sensitive. That makes it a very suitable tool for searching large databases for similar sequences.
The program I will write will take a FASTA file containing multiple sequences and run the megablast command. Then, for each read, a .
Blog
Contaminant Analysis Project
As a starting point, I will describe in detail this small project I was given so that I can get used to the tools, the programming language, in short, to bioinformatics.
We know that however hard we try to prevent it, the risk of contaminants is always present in our laboratory work. The more we reduce it the better; we can then also determine its amount and use that for a further evaluation of our results. One method for finding it is DNA analysis: the DNA of the sample you work with is sequenced, and by analyzing this DNA with various programs we can identify the contaminating organisms from their DNA.
Blog
FASTQ Format - The FASTQ File
Today I received the "test" sequence data I will use while building the program. It consists of two FASTQ files, each compressed and yet still around 6 GB in size. Since I certainly do not want to lose too much time, I will use only a portion of one of these files.
My goal is to find the mappable reads in these FASTQ files with the BWA tool and then remove them, and to save the remaining unmapped reads in a language the MegaBLAST tool can understand (the FASTA format).
Since I am building the whole project on a Unix machine, I am learning many commands along the way; I will try to write about them separately later.
Blog
Dizileme Çalışmalarını Kirleten Organizmaları Tespit Etme
Bu yaz stajımda ilk olarak başlayacağım çalışma yavaş yavaş şekilleniyor. Bu çalışmada bir pipeline oluşturup, bunu laboratuvarlarda dizileme (sequencing) örneklerini kirleten organizmaları bulmaya çalışacağım.
Laboratuvarlarda birçok nedenden dolayı örnekler başka organizmalar ya da yabancı DNA tarafından kirlenebiliyor. Bunlar bakteri, maya olabilir ya da bir virüs DNA’sı da olabilir. Siz bir DNA’yı diziledikten sonra onun referansıyla eşleştirme çok az oranda çıkabiliyor. Bu da yabancı DNA’nın olabileceğini gösteriyor. Bir başka neden referans DNA’nın farklı olması da olabilir.
Tag: organism
Blog
Performing Multiple Searches in SRS
Because the latest version of the parsing script examines more reads than its predecessors, looking up a name in SRS for every single read was a very time-consuming operation. So much so that the last run took 4 days.
To cut this down, I rewrote the parsing script completely. As always, it takes the reads that pass the threshold, but now I collect their ID numbers directly in an array. I then join every element of this list with the pipe character to build a single string.
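The joining step can be sketched in a couple of lines (Python here; the original script is Perl, and the accession IDs below are made up):

```python
def build_srs_query(ids):
    """Join accession IDs with the pipe character so a single SRS
    query can cover many reads at once instead of one query per read."""
    return "|".join(ids)

# build_srs_query(["X12345", "Y67890"]) -> "X12345|Y67890"
```

Batching the lookups this way replaces thousands of round trips with one, which is where the multi-day runtime was going.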
Tag: srs
Blog
Performing Multiple Searches in SRS
Because the latest version of the parsing script examines more reads than its predecessors, looking up a name in SRS for every single read was a very time-consuming operation. So much so that the last run took 4 days.
To cut this down, I rewrote the parsing script completely. As always, it takes the reads that pass the threshold, but now I collect their ID numbers directly in an array. I then join every element of this list with the pipe character to build a single string.
Blog
Reading a Command's Output with Perl on Unix, and Regular Expressions
I have already described how I extract organism names with regular expressions. Here I will talk about something similar, but done in Perl with a slightly more specialized technique: a very useful method I needed because I receive the information from the database as output spread over several lines. You can surely adapt it for other purposes as well.
This need arose because the honest database, built by the HUSAR group, does not present organism names directly but displays them over several lines. You can see an example of this below.
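The technique itself, capturing a command's whole output and matching across line boundaries rather than line by line, can be sketched like this. Python stands in for the Perl original, and the "Organism:" field layout is an invented example, not the real honest output:

```python
import re
import subprocess

def organism_from_output(cmd):
    """Capture a command's full multi-line output, then search it in
    one pass so a field name on one line can be paired with its value
    on the following line."""
    out = subprocess.run(cmd, capture_output=True, text=True).stdout
    m = re.search(r"Organism:\s*\n\s*(.+)", out)
    return m.group(1).strip() if m else None
```

The point is the single `re.search` over the whole buffer: a line-by-line loop would see "Organism:" and its value in separate iterations and never match them together.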
Blog
Obtaining the Species Name with Regular Expressions
Since at the end of my project I will show the user the names of the possible contaminating organisms (their Latin species names), I need to obtain the organism name for every sequence using the accession numbers in the MegaBLAST results. I can do this with another system available on the HUSAR servers, called the Sequence Retrieval System (SRS).
To get an organism name out of SRS, it is enough to type the "getz" command on the Unix command line together with the database name, the accession number, and the field I want to retrieve. Below you can find example code that does this job.
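As a sketch of the extraction idea only (Python rather than the blog's Perl; the "ORGANISM" field layout is an assumption for illustration, not the real getz/SRS output format):

```python
import re

def species_name(record):
    """Pull a Latin binomial ('Genus species') out of a text record,
    e.g. from a hypothetical line such as 'ORGANISM   Mus musculus'."""
    m = re.search(r"ORGANISM\s+([A-Z][a-z]+\s[a-z]+)", record)
    return m.group(1) if m else None
```

The pattern relies on binomial naming conventions: a capitalized genus followed by a lowercase species epithet.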
Blog
Internship at the DKFZ Heidelberg Bioinformatics Unit
The summer internship I am doing through the Erasmus program has started. I first received a few hours of introductory lectures from some of the scientists who run the unit. In them I learned the unit's brief history, the projects it has carried out to date, and their details.
The Bioinformatics Unit is a group within the Genomics and Proteomics Core Facility, one of the core facilities of the DKFZ (Deutsches Krebsforschungszentrum, in English the German Cancer Research Center). They also go by the name HUSAR (Heidelberg Unix Sequence Analysis Resources), which is likewise the name of the sequence analysis package the group develops.
Tag: emacs
Blog
Parsing MegaBLAST Results
The last stage of the pipeline is to examine the output produced for the searched sequences with another script. In this step every megablast file is read, and the values of parameters such as each hit's name, identity, and overlapping length are stored and printed to the screen according to the purpose at hand.
In my project I use a parser called Inslink, part of the HUSAR package, which returns the fields mentioned above to me as an array. All this parser does is read the file and store the values of the requested fields.
I then display these stored values by extending the code, and with a few extra lines I present the meaningful results I need.
Blog
FASTQ Formatı - FASTQ Dosyası
Bugün programı oluştururken kullanacağım “test” dizilimini aldım. İki adet FASTQ dosyasından oluşuyor, her biri sıkıştırılmış ama buna rağmen boyutları 6 GB civarı. Ben elbette çok zaman kaybetmek istemediğim için bu dosyalardan birinin sadece bir kısmını kullanacağım.
Amacım, bu FASTQ dosyalarındaki eşleşebilen okumaları BWA aracı ile bularak, daha sonra onları çıkarmak. Ve kalan eşleşemeyen okumaları MegaBLAST aracının anlayabileceği bir dilde (FASTA formatında) kaydetmek.
Bu arada tüm projeyi bir Unix bilgisayarda hazırladığım için birçok komut öğreniyorum, daha sonra bunları ayrıca yazmaya çalışacağım.
Tag: parsing
Blog
Parsing MegaBLAST Results
The last stage of the pipeline is to examine the output produced for the searched sequences with another script. In this step every megablast file is read, and the values of parameters such as each hit's name, identity, and overlapping length are stored and printed to the screen according to the purpose at hand.
In my project I use a parser called Inslink, part of the HUSAR package, which returns the fields mentioned above to me as an array. All this parser does is read the file and store the values of the requested fields.
I then display these stored values by extending the code, and with a few extra lines I present the meaningful results I need.
Tag: unix
Blog
Parsing MegaBLAST Results
The last stage of the pipeline is to examine the output produced for the searched sequences with another script. In this step every megablast file is read, and the values of parameters such as each hit's name, identity, and overlapping length are stored and printed to the screen according to the purpose at hand.
In my project I use a parser called Inslink, part of the HUSAR package, which returns the fields mentioned above to me as an array. All this parser does is read the file and store the values of the requested fields.
I then display these stored values by extending the code, and with a few extra lines I present the meaningful results I need.
Blog
Reading a Command's Output with Perl on Unix, and Regular Expressions
I have already described how I extract organism names with regular expressions. Here I will talk about something similar, but done in Perl with a slightly more specialized technique: a very useful method I needed because I receive the information from the database as output spread over several lines. You can surely adapt it for other purposes as well.
This need arose because the honest database, built by the HUSAR group, does not present organism names directly but displays them over several lines. You can see an example of this below.
Blog
Obtaining the Species Name with Regular Expressions
Since at the end of my project I will show the user the names of the possible contaminating organisms (their Latin species names), I need to obtain the organism name for every sequence using the accession numbers in the MegaBLAST results. I can do this with another system available on the HUSAR servers, called the Sequence Retrieval System (SRS).
To get an organism name out of SRS, it is enough to type the "getz" command on the Unix command line together with the database name, the accession number, and the field I want to retrieve. Below you can find example code that does this job.
Blog
FASTQ Format - The FASTQ File
Today I received the "test" sequence data I will use while building the program. It consists of two FASTQ files, each compressed and yet still around 6 GB in size. Since I certainly do not want to lose too much time, I will use only a portion of one of these files.
My goal is to find the mappable reads in these FASTQ files with the BWA tool and then remove them, and to save the remaining unmapped reads in a language the MegaBLAST tool can understand (the FASTA format).
Since I am building the whole project on a Unix machine, I am learning many commands along the way; I will try to write about them separately later.
Tag: fastq
Blog
Evaluating the Quality Line - Quality Filter
To develop the contaminant-analysis pipeline further and obtain more meaningful results, we decided to add a quality filter to the first steps (while the FASTQ file is still being processed). By filtering out reads that fall below a certain threshold at that early stage, we will obtain more reliable results.
We will perform this quality control by interpreting the fourth line of each read in the FASTQ file. This fourth line (actually the read's sequencing quality score) is written (encoded) in different ways by different sequencing machines, and the filtering has to be applied after recovering the quality score from that encoding.
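The decoding step can be sketched as follows (Python stand-in for the pipeline's Perl; the offset of 33 is the common Sanger/Illumina 1.8+ "Phred+33" encoding, while some older machines used Phred+64, so check your instrument):

```python
def mean_phred(quality_line, offset=33):
    """Decode a FASTQ quality line back into numeric Phred scores
    (ASCII code minus the encoding offset) and return their mean."""
    scores = [ord(ch) - offset for ch in quality_line]
    return sum(scores) / len(scores)

def passes_filter(quality_line, threshold=20, offset=33):
    """Keep a read only if its mean quality reaches the threshold."""
    return mean_phred(quality_line, offset) >= threshold
```

For example, a quality line of all 'I' characters decodes to Phred 40 under Phred+33, while all '!' characters decode to Phred 0 and would be filtered out.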
Blog
Fourth Test Dataset: The Mus Musculus Genome
All three test datasets so far came from the human genome. I tried the pipeline on these genomes and made improvements here and there. Now I will also try it on other organisms, gather more results, examine them, and again make whatever improvements are needed.
This first different dataset comes from a mouse. This organism, with the species name Mus Musculus and the common name house mouse, is another organism that is sequenced frequently because it is used in studies as a model organism.
I received several sequence files in BAM format from the laboratory I work with, which performed this sequencing.
Blog
Examining the New Dataset
Because the previous data I used for testing while designing the pipeline was so poor, I obtained a new dataset. Of course, it is useful to work with several datasets of different characteristics during the testing phase. But I can say the previous dataset was too poor to yield even a few meaningful results. You can look at the details [here]({% post_url 2012-07-06-eslestirme-ve-eslesmeyen-okumalari %}).
The new dataset is again human genome data; its BAM file is 1.8 GB and contained both mappable and unmappable reads. Using the bam2fastq tool, I converted this BAM file into a FASTQ file while also weeding out the mappable reads, leaving 0.
Blog
Running MegaBLAST from a Single FASTA File - Regular Expressions
Below is the Perl script I wrote to run MegaBLAST by reading a FASTA file and to collect the results in a directory, together with its explanation. This script is an important part of the pipeline I am designing. It is also the first script I wrote, the one that reaches all the reads through a single FASTA file.
#!/usr/bin/perl
$database = $ARGV[0];
$fasta = $ARGV[1];  # input file
$sp = $ARGV[2];     # starting point
$n = $ARGV[3] + $sp;

if (!defined($ARGV[3])) { $n = 12; }  # set default number

open FASTA, $fasta or die $!
Blog
FASTQ to FASTA Conversion Perl Script
FASTQ and FASTA are file formats that carry essentially the same information, except that one of them simply has two fewer lines per sequence. Another difference that matters for my project is that the FASTA format can be searched by MegaBLAST directly. That is why I need to convert the FASTQ format produced by sequencing machines into FASTA. And this script is the first step of the pipeline.
Actually, because the test sequence data I received had not been aligned by the party that delivered it to me, I had performed this alignment as a preliminary step.
Blog
Mapping and Unmapped-Read Extraction Results
Previously I was working with only a part of the data, but from now on I will work with all of it. So I extracted the compressed data I was sent directly into my working directory and ran the operations there.
My starting (FASTQ) file is 2153988289 bytes (2 GB) in size. Mapping with bwa produced a total of 6004193 sequences, or reads. After I then extracted the unmapped reads, the total read count dropped by 551065, to 5493128. In other words, 9.
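The percentage cut off above can be recomputed from the counts given (a quick sanity check, not a figure from the original post):

```python
# counts reported above
total_reads = 6004193
removed_reads = 551065

pct = 100 * removed_reads / total_reads
print(f"{pct:.1f}%")  # roughly 9.2% of the reads were removed
```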
Blog
Mapping (Alignment) with BWA
I forgot to write this down earlier. I had actually mentioned it, but I had not written anything about how it is done, nor added any example commands.
BWA takes our DNA sequence data (in FASTQ format) and the reference genome (in my project, the human genome) and produces a .sai file. This file carries information about the alignment between the sequence and the reference genome, and using this information I can separate out the unmapped reads.
First, we create our .sai file with the command below.
bwa aln $NGSDATAROOT/bwa/human_genome37 ChIP_NoIndex_L001_R1_complete_filtered.fastq > complete_alignment.sai

The .sai file we have created is not very useful on its own, so we convert it into a SAM file and carry on with the work from there.
Blog
SAM File - BAM File - samtools
The pipeline I am supposed to program will actually run its analyses directly on unmapped reads. But since I could not find such data, and the only data I have contains both mapped and unmapped reads, I first had to get rid of the mapped ones.
As I mentioned before, I do this with the bwa aligner (mapper). After a series of steps bwa produces a SAM file, but what I need is a FASTQ file. For that, I convert the SAM file with samtools into the similar BAM format, and then obtain my FASTQ file with the bam2fastq tool.
Blog
First Step: Obtaining the Unmapped Reads
As I mentioned before, the first part of my project is to extract the unmapped reads from the FASTQ file. This way I remove the sequences I will not need for the later analyses and reduce the workload of those analyses.
Since my goal from the start has been to design a pipeline that carries out the whole project step by step, I will do this with a Perl script. This script is the pipeline's first script, the place where the raw FASTQ data coming from the laboratory is used as input. Strictly speaking I will not need this script; I only added the step because my current data also contains mappable reads.
Blog
Contaminant Analysis Project
As a starting point, I will describe in detail this small project I was given so that I can get used to the tools, the programming language, in short, to bioinformatics.
We know that however hard we try to prevent it, the risk of contaminants is always present in our laboratory work. The more we reduce it the better; we can then also determine its amount and use that for a further evaluation of our results. One method for finding it is DNA analysis: the DNA of the sample you work with is sequenced, and by analyzing this DNA with various programs we can identify the contaminating organisms from their DNA.
Blog
FASTQ Format - The FASTQ File
Today I received the "test" sequence data I will use while building the program. It consists of two FASTQ files, each compressed and yet still around 6 GB in size. Since I certainly do not want to lose too much time, I will use only a portion of one of these files.
My goal is to find the mappable reads in these FASTQ files with the BWA tool and then remove them, and to save the remaining unmapped reads in a language the MegaBLAST tool can understand (the FASTA format).
Since I am building the whole project on a Unix machine, I am learning many commands along the way; I will try to write about them separately later.
Tag: fastq quality filter
Blog
Evaluating the Quality Line - Quality Filter
To develop the contaminant-analysis pipeline further and obtain more meaningful results, we decided to add a quality filter to the first steps (while the FASTQ file is still being processed). By filtering out reads that fall below a certain threshold at that early stage, we will obtain more reliable results.
We will perform this quality control by interpreting the fourth line of each read in the FASTQ file. This fourth line (actually the read's sequencing quality score) is written (encoded) in different ways by different sequencing machines, and the filtering has to be applied after recovering the quality score from that encoding.
Tag: fastqc
Blog
Evaluating the Quality Line - Quality Filter
To develop the contaminant-analysis pipeline further and obtain more meaningful results, we decided to add a quality filter to the first steps (while the FASTQ file is still being processed). By filtering out reads that fall below a certain threshold at that early stage, we will obtain more reliable results.
We will perform this quality control by interpreting the fourth line of each read in the FASTQ file. This fourth line (actually the read's sequencing quality score) is written (encoded) in different ways by different sequencing machines, and the filtering has to be applied after recovering the quality score from that encoding.
Tag: fastx toolkit
Blog
Evaluating the Quality Line - Quality Filter
To develop the contaminant-analysis pipeline further and obtain more meaningful results, we decided to add a quality filter to the first steps (while the FASTQ file is still being processed). By filtering out reads that fall below a certain threshold at that early stage, we will obtain more reliable results.
We will perform this quality control by interpreting the fourth line of each read in the FASTQ file. This fourth line (actually the read's sequencing quality score) is written (encoded) in different ways by different sequencing machines, and the filtering has to be applied after recovering the quality score from that encoding.
Tag: filter
Blog
Evaluating the Quality Line - Quality Filter
To develop the contaminant-analysis pipeline further and obtain more meaningful results, we decided to add a quality filter to the first steps (while the FASTQ file is still being processed). By filtering out reads that fall below a certain threshold at that early stage, we will obtain more reliable results.
We will perform this quality control by interpreting the fourth line of each read in the FASTQ file. This fourth line (actually the read's sequencing quality score) is written (encoded) in different ways by different sequencing machines, and the filtering has to be applied after recovering the quality score from that encoding.
Tag: quality
Blog
Evaluating the Quality Line - Quality Filter
To develop the contaminant-analysis pipeline further and obtain more meaningful results, we decided to add a quality filter to the first steps (while the FASTQ file is still being processed). By filtering out reads that fall below a certain threshold at that early stage, we will obtain more reliable results.
We will perform this quality control by interpreting the fourth line of each read in the FASTQ file. This fourth line (actually the read's sequencing quality score) is written (encoded) in different ways by different sequencing machines, and the filtering has to be applied after recovering the quality score from that encoding.
Tag: pipeline
Blog
Evaluating the Quality Line - Quality Filter
To develop the contaminant-analysis pipeline further and obtain more meaningful results, we decided to add a quality filter to the first steps (while the FASTQ file is still being processed). By filtering out reads that fall below a certain threshold at that early stage, we will obtain more reliable results.
We will perform this quality control by interpreting the fourth line of each read in the FASTQ file. This fourth line (actually the read's sequencing quality score) is written (encoded) in different ways by different sequencing machines, and the filtering has to be applied after recovering the quality score from that encoding.
Blog
Fourth Test Dataset: The Mus Musculus Genome
All three test datasets so far came from the human genome. I tried the pipeline on these genomes and made improvements here and there. Now I will also try it on other organisms, gather more results, examine them, and again make whatever improvements are needed.
This first different dataset comes from a mouse. This organism, with the species name Mus Musculus and the common name house mouse, is another organism that is sequenced frequently because it is used in studies as a model organism.
I received several sequence files in BAM format from the laboratory I work with, which performed this sequencing.
Blog
Examining the New Dataset
Because the previous data I used for testing while designing the pipeline was so poor, I obtained a new dataset. Of course, it is useful to work with several datasets of different characteristics during the testing phase. But I can say the previous dataset was too poor to yield even a few meaningful results. You can look at the details [here]({% post_url 2012-07-06-eslestirme-ve-eslesmeyen-okumalari %}).
The new dataset is again human genome data; its BAM file is 1.8 GB and contained both mappable and unmappable reads. Using the bam2fastq tool, I converted this BAM file into a FASTQ file while also weeding out the mappable reads, leaving 0.
Blog
Detecting the Organisms That Contaminate Sequencing Studies
The first study I will take on during my summer internship is slowly taking shape. In this study I will build a pipeline and use it to try to find the organisms that contaminate sequencing samples in laboratories.
In laboratories, samples can become contaminated with other organisms or foreign DNA for many reasons. These can be bacteria or yeast, or even viral DNA. After you sequence a DNA sample, its alignment rate against the reference may turn out very low, which indicates that foreign DNA may be present. Another possible cause is that the reference DNA itself is different.
Blog
Pipeline and Pipeline Development
Continuing the introductory lectures I received today, I was given detailed information about pipelines and pipeline development. A pipeline is literally just that, a pipe line, such as the system of pipes used to carry oil from one place to another. In computing terminology it means a chain of processing elements arranged so that the output of one element becomes the input of the next. Much more complicated jobs can thus be carried out easily and in an orderly way by building a pipeline. I believe pipeline is rendered in Turkish as "ardisik duzen", but I will keep using "pipeline".
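The definition above, each element's output feeding the next element's input, can be sketched in a few lines (Python; the two stages are invented toy examples, not steps from the actual project):

```python
def stage_uppercase(text):
    """First processing element: normalize a DNA string."""
    return text.upper()

def stage_count_gc(text):
    """Second element: consume the first element's output
    and count G and C bases."""
    return text.count("G") + text.count("C")

def run_pipeline(data, stages):
    """Feed each stage's output into the next stage, in order."""
    for stage in stages:
        data = stage(data)
    return data
```

For instance, `run_pipeline("acgtgg", [stage_uppercase, stage_count_gc])` first normalizes the string and then counts its G/C bases; swapping, adding, or removing stages changes the pipeline without touching the stages themselves.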
Tag: prinseq
Blog
Evaluating the Quality Line - Quality Filter
To develop the contaminant-analysis pipeline further and obtain more meaningful results, we decided to add a quality filter to the first steps (while the FASTQ file is still being processed). By filtering out reads that fall below a certain threshold at that early stage, we will obtain more reliable results.
We will perform this quality control by interpreting the fourth line of each read in the FASTQ file. This fourth line (actually the read's sequencing quality score) is written (encoded) in different ways by different sequencing machines, and the filtering has to be applied after recovering the quality score from that encoding.
Tag: quality
Blog
Evaluating the Quality Line - Quality Filter
To develop the contaminant-analysis pipeline further and obtain more meaningful results, we decided to add a quality filter to the first steps (while the FASTQ file is still being processed). By filtering out reads that fall below a certain threshold at that early stage, we will obtain more reliable results.
We will perform this quality control by interpreting the fourth line of each read in the FASTQ file. This fourth line (actually the read's sequencing quality score) is written (encoded) in different ways by different sequencing machines, and the filtering has to be applied after recovering the quality score from that encoding.
Tag: quality filter
Blog
Evaluating the Quality Line - Quality Filter
To develop the contaminant-analysis pipeline further and obtain more meaningful results, we decided to add a quality filter to the first steps (while the FASTQ file is still being processed). By filtering out reads that fall below a certain threshold at that early stage, we will obtain more reliable results.
We will perform this quality control by interpreting the fourth line of each read in the FASTQ file. This fourth line (actually the read's sequencing quality score) is written (encoded) in different ways by different sequencing machines, and the filtering has to be applied after recovering the quality score from that encoding.
Tag: bam
Blog
Fourth Test Dataset: The Mus Musculus Genome
All three test datasets so far came from the human genome. I tried the pipeline on these genomes and made improvements here and there. Now I will also try it on other organisms, gather more results, examine them, and again make whatever improvements are needed.
This first different dataset comes from a mouse. This organism, with the species name Mus Musculus and the common name house mouse, is another organism that is sequenced frequently because it is used in studies as a model organism.
I received several sequence files in BAM format from the laboratory I work with, which performed this sequencing.
Blog
Examining the New Dataset
Because the previous data I used for testing while designing the pipeline was so poor, I obtained a new dataset. Of course, it is useful to work with several datasets of different characteristics during the testing phase. But I can say the previous dataset was too poor to yield even a few meaningful results. You can look at the details [here]({% post_url 2012-07-06-eslestirme-ve-eslesmeyen-okumalari %}).
The new dataset is again human genome data; its BAM file is 1.8 GB and contained both mappable and unmappable reads. Using the bam2fastq tool, I converted this BAM file into a FASTQ file while also weeding out the mappable reads, leaving 0.
Blog
SAM File - BAM File - samtools
The pipeline I am supposed to program will actually run its analyses directly on unmapped reads. But since I could not find such data, and the only data I have contains both mapped and unmapped reads, I first had to get rid of the mapped ones.
As I mentioned before, I do this with the bwa aligner (mapper). After a series of steps bwa produces a SAM file, but what I need is a FASTQ file. For that, I convert the SAM file with samtools into the similar BAM format, and then obtain my FASTQ file with the bam2fastq tool.
Tag: bam2fastq
Blog
Fourth Test Dataset: The Mus Musculus Genome
All three test datasets so far came from the human genome. I tried the pipeline on these genomes and made improvements here and there. Now I will also try it on other organisms, gather more results, examine them, and again make whatever improvements are needed.
This first different dataset comes from a mouse. This organism, with the species name Mus Musculus and the common name house mouse, is another organism that is sequenced frequently because it is used in studies as a model organism.
I received several sequence files in BAM format from the laboratory I work with, which performed this sequencing.
Blog
SAM File - BAM File - samtools
The pipeline I am supposed to program will actually run its analyses directly on unmapped reads. But since I could not find such data, and the only data I have contains both mapped and unmapped reads, I first had to get rid of the mapped ones.
As I mentioned before, I do this with the bwa aligner (mapper). After a series of steps bwa produces a SAM file, but what I need is a FASTQ file. For that, I convert the SAM file with samtools into the similar BAM format, and then obtain my FASTQ file with the bam2fastq tool.
Tag: unmapped read
Blog
Fourth Test Dataset: The Mus Musculus Genome
All three test datasets so far came from the human genome. I tried the pipeline on these genomes and made improvements here and there. Now I will also try it on other organisms, gather more results, examine them, and again make whatever improvements are needed.
This first different dataset comes from a mouse. This organism, with the species name Mus Musculus and the common name house mouse, is another organism that is sequenced frequently because it is used in studies as a model organism.
I received several sequence files in BAM format from the laboratory I work with, which performed this sequencing.
Blog
Mapping and Unmapped-Read Extraction Results
Previously I was working with only a part of the data, but from now on I will work with all of it. So I extracted the compressed data I was sent directly into my working directory and ran the operations there.
My starting (FASTQ) file is 2153988289 bytes (2 GB) in size. Mapping with bwa produced a total of 6004193 sequences, or reads. After I then extracted the unmapped reads, the total read count dropped by 551065, to 5493128. In other words, 9.
Blog
SAM File - BAM File - samtools
The pipeline I am supposed to program will actually run its analyses directly on unmapped reads. But since I could not find such data, and the only data I have contains both mapped and unmapped reads, I first had to get rid of the mapped ones.
As I mentioned before, I do this with the bwa aligner (mapper). After a series of steps bwa produces a SAM file, but what I need is a FASTQ file. For that, I convert the SAM file with samtools into the similar BAM format, and then obtain my FASTQ file with the bam2fastq tool.
Blog
First Step: Obtaining the Unmapped Reads
As I mentioned before, the first part of my project is to extract the unmapped reads from the FASTQ file. This way I remove the sequences I will not need for the later analyses and reduce the workload of those analyses.
Since my goal from the start has been to design a pipeline that carries out the whole project step by step, I will do this with a Perl script. This script is the pipeline's first script, the place where the raw FASTQ data coming from the laboratory is used as input. Strictly speaking I will not need this script; I only added the step because my current data also contains mappable reads.
Blog
Detecting the Organisms That Contaminate Sequencing Studies
The study I will start with in my summer internship is slowly taking shape. In this study I will build a pipeline and use it to try to find the organisms that contaminate sequencing samples in laboratories.
In laboratories, samples can be contaminated by other organisms or foreign DNA for many reasons. These can be bacteria or yeast, or even viral DNA. After you sequence a DNA sample, its alignment rate against its reference can come out very low. This indicates that foreign DNA may be present. Another cause can also be that the reference DNA is different.
Tag: ev faresi
Blog
Fourth Test Dataset: The Mus Musculus Genome
So far, all three test datasets have belonged to the human genome. I tried the pipeline on these genomes and made improvements here and there. Now I will try it with other organisms as well, collect more results, examine them, and again make the necessary improvements.
This first different dataset comes from the mouse. This organism, carrying the species name Mus musculus and the common name house mouse, is another organism whose genome is sequenced frequently, since it is also used as a model organism in research.
I received various sequence files in BAM format from the laboratory I work with, which performed the sequencing.
Tag: insan
Blog
Fourth Test Dataset: The Mus Musculus Genome
So far, all three test datasets have belonged to the human genome. I tried the pipeline on these genomes and made improvements here and there. Now I will try it with other organisms as well, collect more results, examine them, and again make the necessary improvements.
This first different dataset comes from the mouse. This organism, carrying the species name Mus musculus and the common name house mouse, is another organism whose genome is sequenced frequently, since it is also used as a model organism in research.
I received various sequence files in BAM format from the laboratory I work with, which performed the sequencing.
Tag: insan genomu
Blog
Fourth Test Dataset: The Mus Musculus Genome
So far, all three test datasets have belonged to the human genome. I tried the pipeline on these genomes and made improvements here and there. Now I will try it with other organisms as well, collect more results, examine them, and again make the necessary improvements.
This first different dataset comes from the mouse. This organism, carrying the species name Mus musculus and the common name house mouse, is another organism whose genome is sequenced frequently, since it is also used as a model organism in research.
I received various sequence files in BAM format from the laboratory I work with, which performed the sequencing.
Blog
Mapping (Alignment) with BWA
I had forgotten to write about this before. I did mention it, but I never wrote anything about how it is actually done, nor did I add any example commands.
BWA takes the DNA sequences we have (in FASTQ format) and the reference genome (in my project, the human genome) and creates a .sai file. This file carries information about the alignment between the sequences and the reference genome, and using this information I can separate out the reads that do not map.
First, we create our .sai file with the command below.

bwa aln $NGSDATAROOT/bwa/human_genome37 ChIP_NoIndex_L001_R1_complete_filtered.fastq > complete_alignment.sai

The .sai file we created is not very useful on its own, so we convert it into a SAM file and continue working with that.
Blog
BWA (Burrows-Wheeler Aligner) - Aligner / Mapper
As I noted in my previous post, I will try to find out, using an aligner (or mapper), to what extent the data I have maps to the reference genome. Afterwards I will run a number of analyses on the part that does not map.
BWA (Burrows-Wheeler Aligner) is a program that aligns relatively short sequences to long reference genomes such as the human genome. For sequences up to 200 bp (bp: base pairs) the bwa-short algorithm is used; between 200 bp and 100 kbp, the BWA-SW algorithm.
Many factors play a role in choosing an aligner/mapper. There are many tools of this kind, and they each have different features.
Blog
What Is Bioinformatics? A Definition of Bioinformatics
With the sequencing of many organisms — and finally, in 2001, of the human genome, with the sequence of all 3 billion base pairs obtained — fields emerged that would use this information in different ways. Alongside the fields that try to understand these genes and to determine the proteins they will produce, the need to analyze this information gave birth to the field of Bioinformatics.
Bioinformatics is the analysis of biological information using computers and statistical techniques; in other words, bioinformatics is the science of developing and making use of computer databases and algorithms to improve and speed up biological research [1].
Tag: mus musculus
Blog
Fourth Test Dataset: The Mus Musculus Genome
So far, all three test datasets have belonged to the human genome. I tried the pipeline on these genomes and made improvements here and there. Now I will try it with other organisms as well, collect more results, examine them, and again make the necessary improvements.
This first different dataset comes from the mouse. This organism, carrying the species name Mus musculus and the common name house mouse, is another organism whose genome is sequenced frequently, since it is also used as a model organism in research.
I received various sequence files in BAM format from the laboratory I work with, which performed the sequencing.
Tag: ambiguous hit
Blog
Separating Analysis Results as "Ambiguous"
While examining the results of my searches against various databases, I evaluate them against several threshold values, and I also try to make them even more meaningful by separating the hits above or below those thresholds into “Ambiguous” and “Unique”.
As “Ambiguous”, I label the hits in a given megablast file that satisfy the thresholds but involve more than one distinct organism. If every hit satisfying the thresholds within a single file always belongs to the same organism, then I label it “unique”.
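The labeling rule above fits in a few lines of code. The pipeline itself is written in Perl; as an illustration, here is a Python sketch of the same rule, where the input — a list of organism names for the hits in one megablast file that already passed the thresholds — is an assumed data layout, not the post's actual structure.

```python
def label_hits(organisms_passing_threshold):
    """Label one megablast result file: 'unique' if every hit that passes
    the thresholds names the same organism, 'ambiguous' if more than one
    distinct organism appears among them."""
    distinct = set(organisms_passing_threshold)
    return "unique" if len(distinct) == 1 else "ambiguous"

print(label_hits(["Homo sapiens", "Homo sapiens"]))     # all hits agree
print(label_hits(["Homo sapiens", "Pan troglodytes"]))  # two organisms
```

Run over every result file, this yields exactly the "Ambiguous hit" counts that appear in the organism tables elsewhere on the blog.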
Tag: anahtar
Blog
Separating Analysis Results as "Ambiguous"
While examining the results of my searches against various databases, I evaluate them against several threshold values, and I also try to make them even more meaningful by separating the hits above or below those thresholds into “Ambiguous” and “Unique”.
As “Ambiguous”, I label the hits in a given megablast file that satisfy the thresholds but involve more than one distinct organism. If every hit satisfying the thresholds within a single file always belongs to the same organism, then I label it “unique”.
Tag: belirsiz
Blog
Separating Analysis Results as "Ambiguous"
While examining the results of my searches against various databases, I evaluate them against several threshold values, and I also try to make them even more meaningful by separating the hits above or below those thresholds into “Ambiguous” and “Unique”.
As “Ambiguous”, I label the hits in a given megablast file that satisfy the thresholds but involve more than one distinct organism. If every hit satisfying the thresholds within a single file always belongs to the same organism, then I label it “unique”.
Tag: çok anlamlı
Blog
Separating Analysis Results as "Ambiguous"
While examining the results of my searches against various databases, I evaluate them against several threshold values, and I also try to make them even more meaningful by separating the hits above or below those thresholds into “Ambiguous” and “Unique”.
As “Ambiguous”, I label the hits in a given megablast file that satisfy the thresholds but involve more than one distinct organism. If every hit satisfying the thresholds within a single file always belongs to the same organism, then I label it “unique”.
Tag: değer
Blog
Separating Analysis Results as "Ambiguous"
While examining the results of my searches against various databases, I evaluate them against several threshold values, and I also try to make them even more meaningful by separating the hits above or below those thresholds into “Ambiguous” and “Unique”.
As “Ambiguous”, I label the hits in a given megablast file that satisfy the thresholds but involve more than one distinct organism. If every hit satisfying the thresholds within a single file always belongs to the same organism, then I label it “unique”.
Tag: eşsiz
Blog
Separating Analysis Results as "Ambiguous"
While examining the results of my searches against various databases, I evaluate them against several threshold values, and I also try to make them even more meaningful by separating the hits above or below those thresholds into “Ambiguous” and “Unique”.
As “Ambiguous”, I label the hits in a given megablast file that satisfy the thresholds but involve more than one distinct organism. If every hit satisfying the thresholds within a single file always belongs to the same organism, then I label it “unique”.
Tag: key
Blog
Separating Analysis Results as "Ambiguous"
While examining the results of my searches against various databases, I evaluate them against several threshold values, and I also try to make them even more meaningful by separating the hits above or below those thresholds into “Ambiguous” and “Unique”.
As “Ambiguous”, I label the hits in a given megablast file that satisfy the thresholds but involve more than one distinct organism. If every hit satisfying the thresholds within a single file always belongs to the same organism, then I label it “unique”.
Tag: tek
Blog
Separating Analysis Results as "Ambiguous"
While examining the results of my searches against various databases, I evaluate them against several threshold values, and I also try to make them even more meaningful by separating the hits above or below those thresholds into “Ambiguous” and “Unique”.
As “Ambiguous”, I label the hits in a given megablast file that satisfy the thresholds but involve more than one distinct organism. If every hit satisfying the thresholds within a single file always belongs to the same organism, then I label it “unique”.
Tag: unique hit
Blog
Separating Analysis Results as "Ambiguous"
While examining the results of my searches against various databases, I evaluate them against several threshold values, and I also try to make them even more meaningful by separating the hits above or below those thresholds into “Ambiguous” and “Unique”.
As “Ambiguous”, I label the hits in a given megablast file that satisfy the thresholds but involve more than one distinct organism. If every hit satisfying the thresholds within a single file always belongs to the same organism, then I label it “unique”.
Tag: value
Blog
Separating Analysis Results as "Ambiguous"
While examining the results of my searches against various databases, I evaluate them against several threshold values, and I also try to make them even more meaningful by separating the hits above or below those thresholds into “Ambiguous” and “Unique”.
As “Ambiguous”, I label the hits in a given megablast file that satisfy the thresholds but involve more than one distinct organism. If every hit satisfying the thresholds within a single file always belongs to the same organism, then I label it “unique”.
Tag: dataset
Blog
Analysis Results for the Second Dataset
I have completed the analysis of the second dataset, the one with fewer unmapped reads. Since it was a better sequencing sample than the previous one, the results I obtained were also quite consistent. After analyzing a sequence belonging to the human genome, I got the results below.

LIST OF ORGANISMS AND THEIR NUMBER OF OCCURENCES
Ambiguous hit            1323
Homo sapiens              312
Pan troglodytes            25
Pongo abelii               18
Nomascus leucogenys        17
Halomonas sp. GFAJ-1        7
Callithrix jacchus          4
Macaca mulatta              3
Oryctolagus cuniculus       2
Loxodonta africana          1
Cavia porcellus             1

I will explain the term “Ambiguous hit” in a separate post.
Tag: refseq
Blog
Analysis Results for the Second Dataset
I have completed the analysis of the second dataset, the one with fewer unmapped reads. Since it was a better sequencing sample than the previous one, the results I obtained were also quite consistent. After analyzing a sequence belonging to the human genome, I got the results below.

LIST OF ORGANISMS AND THEIR NUMBER OF OCCURENCES
Ambiguous hit            1323
Homo sapiens              312
Pan troglodytes            25
Pongo abelii               18
Nomascus leucogenys        17
Halomonas sp. GFAJ-1        7
Callithrix jacchus          4
Macaca mulatta              3
Oryctolagus cuniculus       2
Loxodonta africana          1
Cavia porcellus             1

I will explain the term “Ambiguous hit” in a separate post.
Blog
Obtaining Species Names with Regular Expressions
Since at the end of the project I will show the user the names of the possible contaminating organisms (their Latin species names), I need to obtain the organism name for each sequence using the accession numbers in the MegaBLAST results. I can do this with another system called the Sequence Retrieval System (SRS), available on the HUSAR servers.
To get the organism name from SRS, it is enough to type the “getz” command on the Unix command line together with the database name, the accession number, and the field I want to retrieve. Below you can find a sample piece of code that does this job.
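The post's own sample code is cut off in this excerpt. As an illustration of the extraction step — pulling the organism name out of a captured line of command output with a regular expression — here is a Python sketch; the `ORG` field label and line layout are assumptions for illustration, not the real SRS/getz output format, and the post's own scripts do this in Perl.

```python
import re

def species_name(getz_output):
    """Pull a Latin species name (genus + species, e.g. 'Mus musculus')
    out of an assumed 'ORG   <name> ...' line in the captured output."""
    match = re.search(r"^ORG\s+([A-Z][a-z]+ [a-z]+)", getz_output, re.M)
    return match.group(1) if match else None

# Assumed shape of an organism line; the real output may differ.
sample_output = "ORG   Mus musculus (house mouse)\n"
print(species_name(sample_output))
```

In a real pipeline, `getz_output` would be the captured stdout of the command run for one accession number.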
Blog
The Contents of a MegaBLAST Output - The RefSeq Database
Below we see the details of one hit, taken from the file I obtained by searching the test FASTA file against the refseq_genomic database.

>>>>refseq_genomic_complete3: AC_000033_0310 Continuation (311 of 1357) of AC_000033 from base 31000001 (AC_000033 Mus musculus strain mixed chromosome 11, alternate assembly Mm_Celera, whole genome shotgun sequence. 2/2012)
        Length = 110000

 Score =  115 bits (58), Expect = 4e-22
 Identities = 74/79 (93%), Gaps = 2/79 (2%)
 Strand = Plus / Minus

Query: 1     ctctctctgtct-tctctctctctctgtctctctctctttctctctcttctctctctctc 59
             |||||||||||| ||| ||||||||| ||||||||||| |||||||||||||||||||||
Sbjct: 89773 ctctctctgtctgtctttctctctctctctctctctctctctctctcttctctctctctc 89714

Query: 60    tttctctctgccctctctc 78
             ||||||||| |||||||||
Sbjct: 89713 tttctctct-ccctctctc 89696

In the details, the header information about the hit comes first, marked with the >>>> characters.
Blog
Choosing a Database
My goal in this project is to find possible contaminating organisms (contaminants), so I need a broad database. But while keeping the database broad provides that advantage, searching it for every single sequence takes a great deal of computing power and time. That is why, while developing my project, I am also examining various databases, and investigating how I can restrict them to make them best suited to my purpose.
I started with NCBI's Reference Sequence — RefSeq — database.
Tag: refseq_dna
Blog
Analysis Results for the Second Dataset
I have completed the analysis of the second dataset, the one with fewer unmapped reads. Since it was a better sequencing sample than the previous one, the results I obtained were also quite consistent. After analyzing a sequence belonging to the human genome, I got the results below.

LIST OF ORGANISMS AND THEIR NUMBER OF OCCURENCES
Ambiguous hit            1323
Homo sapiens              312
Pan troglodytes            25
Pongo abelii               18
Nomascus leucogenys        17
Halomonas sp. GFAJ-1        7
Callithrix jacchus          4
Macaca mulatta              3
Oryctolagus cuniculus       2
Loxodonta africana          1
Cavia porcellus             1

I will explain the term “Ambiguous hit” in a separate post.
Blog
Obtaining Species Names with Regular Expressions
Since at the end of the project I will show the user the names of the possible contaminating organisms (their Latin species names), I need to obtain the organism name for each sequence using the accession numbers in the MegaBLAST results. I can do this with another system called the Sequence Retrieval System (SRS), available on the HUSAR servers.
To get the organism name from SRS, it is enough to type the “getz” command on the Unix command line together with the database name, the accession number, and the field I want to retrieve. Below you can find a sample piece of code that does this job.
Tag: refseq_genomic
Blog
Analysis Results for the Second Dataset
I have completed the analysis of the second dataset, the one with fewer unmapped reads. Since it was a better sequencing sample than the previous one, the results I obtained were also quite consistent. After analyzing a sequence belonging to the human genome, I got the results below.

LIST OF ORGANISMS AND THEIR NUMBER OF OCCURENCES
Ambiguous hit            1323
Homo sapiens              312
Pan troglodytes            25
Pongo abelii               18
Nomascus leucogenys        17
Halomonas sp. GFAJ-1        7
Callithrix jacchus          4
Macaca mulatta              3
Oryctolagus cuniculus       2
Loxodonta africana          1
Cavia porcellus             1

I will explain the term “Ambiguous hit” in a separate post.
Blog
A Command's Runtime Depending on the Database - CPU Runtime
When the files and databases you work with are large and you do not have enough computing power, the first thing to measure is how to obtain the result in the most effective way and in the shortest time.
In my project in particular, I am investigating this with different databases and different parameters.
For now I am trying four databases: nrnuc, ensembl_cdna, honest and refseq_genomic. I will also do this for two different word sizes. The word size is the number of base pairs that MegaBLAST matches exactly while searching. So if I have a sequence of 151 base pairs and the word size is set to 50, the search will look for stretches that start anywhere within those 151 base pairs but contain at least 50 consecutive matching bases.
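The word-size behaviour described above — requiring a stretch of at least N consecutive, exactly matching bases somewhere in the read — can be illustrated with a toy check. This is only the seeding idea, sketched in Python; it is not how MegaBLAST is actually implemented, and the sequences are made up for illustration.

```python
def has_seed(query, subject, word_size):
    """Return True if any stretch of `word_size` consecutive bases from
    the query occurs exactly in the subject - the kind of seed match a
    word-size parameter demands before an alignment is considered."""
    for i in range(len(query) - word_size + 1):
        if query[i:i + word_size] in subject:
            return True
    return False

read = "acgtacgtacgt"                 # a toy 12-bp read
reference = "ttttacgtacgttttt"        # a toy reference fragment
print(has_seed(read, reference, 8))   # an 8-base exact stretch exists
print(has_seed(read, reference, 13))  # impossible: longer than the read
```

A larger word size means fewer candidate seeds, which is exactly why it trades sensitivity for speed in the runtime comparisons above.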
Tag: veriseti
Blog
Analysis Results for the Second Dataset
I have completed the analysis of the second dataset, the one with fewer unmapped reads. Since it was a better sequencing sample than the previous one, the results I obtained were also quite consistent. After analyzing a sequence belonging to the human genome, I got the results below.

LIST OF ORGANISMS AND THEIR NUMBER OF OCCURENCES
Ambiguous hit            1323
Homo sapiens              312
Pan troglodytes            25
Pongo abelii               18
Nomascus leucogenys        17
Halomonas sp. GFAJ-1        7
Callithrix jacchus          4
Macaca mulatta              3
Oryctolagus cuniculus       2
Loxodonta africana          1
Cavia porcellus             1

I will explain the term “Ambiguous hit” in a separate post.
Tag: blast
Blog
Running MegaBLAST from Multiple Sequence Files
I wrote the script below to fit a technique we devised to speed up the pipeline's MegaBLAST search. What it does is use the sequence files that were created and formatted for each read to search the databases, starting from a given starting point and for a given number of reads.

#!/usr/local/bin/perl

$database = $ARGV[0];
$dir = $ARGV[1];    # directory for sequences
$sp = $ARGV[2];     # starting point
$n = $ARGV[3] + $sp;

while (1) {
    system("blastplus -programname=megablast $dir/read_$sp.seq $database -OUTFILE=read_$sp.megablast -nobatch -d");
    $sp++;
    last if ($sp == $n);
}

Everything here works through really simple programming.
Blog
The Contents of a MegaBLAST Output - The RefSeq Database
Below we see the details of one hit, taken from the file I obtained by searching the test FASTA file against the refseq_genomic database.

>>>>refseq_genomic_complete3: AC_000033_0310 Continuation (311 of 1357) of AC_000033 from base 31000001 (AC_000033 Mus musculus strain mixed chromosome 11, alternate assembly Mm_Celera, whole genome shotgun sequence. 2/2012)
        Length = 110000

 Score =  115 bits (58), Expect = 4e-22
 Identities = 74/79 (93%), Gaps = 2/79 (2%)
 Strand = Plus / Minus

Query: 1     ctctctctgtct-tctctctctctctgtctctctctctttctctctcttctctctctctc 59
             |||||||||||| ||| ||||||||| ||||||||||| |||||||||||||||||||||
Sbjct: 89773 ctctctctgtctgtctttctctctctctctctctctctctctctctcttctctctctctc 89714

Query: 60    tttctctctgccctctctc 78
             ||||||||| |||||||||
Sbjct: 89713 tttctctct-ccctctctc 89696

In the details, the header information about the hit comes first, marked with the >>>> characters.
Blog
MegaBLAST - A Tool for Finding Similarities Between Sequences
MegaBLAST is part of the BLAST (Basic Local Alignment Search Tool) package available in the HUSAR suite. It is also a variant of BLASTN. MegaBLAST handles long sequences more efficiently than BLASTN and works much faster, but it is less sensitive. This makes it a very suitable tool for searching large databases for similar sequences.
The program I will write will take a FASTA file containing multiple sequences and run the megablast command. Afterwards, for each read, a .
Tag: blastplus
Blog
Running MegaBLAST from Multiple Sequence Files
I wrote the script below to fit a technique we devised to speed up the pipeline's MegaBLAST search. What it does is use the sequence files that were created and formatted for each read to search the databases, starting from a given starting point and for a given number of reads.

#!/usr/local/bin/perl

$database = $ARGV[0];
$dir = $ARGV[1];    # directory for sequences
$sp = $ARGV[2];     # starting point
$n = $ARGV[3] + $sp;

while (1) {
    system("blastplus -programname=megablast $dir/read_$sp.seq $database -OUTFILE=read_$sp.megablast -nobatch -d");
    $sp++;
    last if ($sp == $n);
}

Everything here works through really simple programming.
Tag: dizi
Blog
Running MegaBLAST from Multiple Sequence Files
I wrote the script below to fit a technique we devised to speed up the pipeline's MegaBLAST search. What it does is use the sequence files that were created and formatted for each read to search the databases, starting from a given starting point and for a given number of reads.

#!/usr/local/bin/perl

$database = $ARGV[0];
$dir = $ARGV[1];    # directory for sequences
$sp = $ARGV[2];     # starting point
$n = $ARGV[3] + $sp;

while (1) {
    system("blastplus -programname=megablast $dir/read_$sp.seq $database -OUTFILE=read_$sp.megablast -nobatch -d");
    $sp++;
    last if ($sp == $n);
}

Everything here works through really simple programming.
Tag: nofilter
Blog
Running MegaBLAST from Multiple Sequence Files
I wrote the script below to fit a technique we devised to speed up the pipeline's MegaBLAST search. What it does is use the sequence files that were created and formatted for each read to search the databases, starting from a given starting point and for a given number of reads.

#!/usr/local/bin/perl

$database = $ARGV[0];
$dir = $ARGV[1];    # directory for sequences
$sp = $ARGV[2];     # starting point
$n = $ARGV[3] + $sp;

while (1) {
    system("blastplus -programname=megablast $dir/read_$sp.seq $database -OUTFILE=read_$sp.megablast -nobatch -d");
    $sp++;
    last if ($sp == $n);
}

Everything here works through really simple programming.
Tag: repeating sequences
Blog
Running MegaBLAST from Multiple Sequence Files
I wrote the script below to fit a technique we devised to speed up the pipeline's MegaBLAST search. What it does is use the sequence files that were created and formatted for each read to search the databases, starting from a given starting point and for a given number of reads.

#!/usr/local/bin/perl

$database = $ARGV[0];
$dir = $ARGV[1];    # directory for sequences
$sp = $ARGV[2];     # starting point
$n = $ARGV[3] + $sp;

while (1) {
    system("blastplus -programname=megablast $dir/read_$sp.seq $database -OUTFILE=read_$sp.megablast -nobatch -d");
    $sp++;
    last if ($sp == $n);
}

Everything here works through really simple programming.
Tag: tekrar eden diziler
Blog
Running MegaBLAST from Multiple Sequence Files
I wrote the script below to fit a technique we devised to speed up the pipeline's MegaBLAST search. What it does is use the sequence files that were created and formatted for each read to search the databases, starting from a given starting point and for a given number of reads.

#!/usr/local/bin/perl

$database = $ARGV[0];
$dir = $ARGV[1];    # directory for sequences
$sp = $ARGV[2];     # starting point
$n = $ARGV[3] + $sp;

while (1) {
    system("blastplus -programname=megablast $dir/read_$sp.seq $database -OUTFILE=read_$sp.megablast -nobatch -d");
    $sp++;
    last if ($sp == $n);
}

Everything here works through really simple programming.
Tag: while
Blog
Running MegaBLAST from Multiple Sequence Files
I wrote the script below to fit a technique we devised to speed up the pipeline's MegaBLAST search. What it does is use the sequence files that were created and formatted for each read to search the databases, starting from a given starting point and for a given number of reads.

#!/usr/local/bin/perl

$database = $ARGV[0];
$dir = $ARGV[1];    # directory for sequences
$sp = $ARGV[2];     # starting point
$n = $ARGV[3] + $sp;

while (1) {
    system("blastplus -programname=megablast $dir/read_$sp.seq $database -OUTFILE=read_$sp.megablast -nobatch -d");
    $sp++;
    last if ($sp == $n);
}

Everything here works through really simple programming.
Blog
A Perl Script for Converting FASTQ to FASTA
FASTQ and FASTA are file formats that actually carry the same information, except that one of them simply has two fewer lines of information per sequence. Another difference, important for my project, is that the FASTA format can be searched directly with MegaBLAST. That is why I need to convert the FASTQ format produced by the sequencing machines into FASTA. And this script is the first step of the pipeline.
Actually, since the test sequence data I received had not been aligned by the party who delivered it to me, I had carried out that alignment as a preliminary step.
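The format difference described above — four lines per FASTQ record versus two per FASTA record — makes the conversion mechanical. The post's converter is a Perl script; as an illustration, here is the same idea sketched in Python on an in-memory list of lines (real pipelines would stream the file instead).

```python
def fastq_to_fasta(fastq_lines):
    """Convert FASTQ records (4 lines each: @id, sequence, '+', qualities)
    into FASTA records (2 lines each: >id, sequence), dropping the two
    quality-related lines."""
    fasta = []
    for i in range(0, len(fastq_lines), 4):
        header, sequence = fastq_lines[i], fastq_lines[i + 1]
        fasta.append(">" + header[1:])  # '@read1' becomes '>read1'
        fasta.append(sequence)
    return fasta

record = ["@read1", "ACGTACGT", "+", "IIIIIIII"]
print("\n".join(fastq_to_fasta(record)))
```

Dropping the `+` separator and the quality string is exactly the "two fewer lines per sequence" the post describes.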
Tag: düzenli ifadeler
Blog
Running MegaBLAST from a Single FASTA File - Regular Expressions
Below are the Perl script I wrote to run MegaBLAST by reading a FASTA file and to collect the results in a directory, together with its explanation. This script is an important part of the pipeline I am designing. It is the first script I wrote, and the one that reaches all the reads through a single FASTA file.

#!/usr/local/bin/perl

$database = $ARGV[0];
$fasta = $ARGV[1];    # input file
$sp = $ARGV[2];       # starting point
$n = $ARGV[3] + $sp;

if (!defined($n)) { $n = 12; }   # set default number

open FASTA, $fasta or die $!
Blog
Reading a Command's Output with Perl on Unix, and Regular Expressions
I described earlier how I extract organism names with regular expressions. Here I will talk about something similar again, but done in Perl with a somewhat more specialized technique — a very useful method I needed because I receive the information from the database as output spanning several lines. You can certainly use something like it for other purposes too, and benefit from it.
The need arose because the honest database, built by the HUSAR group, does not present organism names directly but spreads them over several lines. You can see an example of this below.
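The technique the post describes is slurping a command's multi-line output and matching a pattern across lines; in Perl this is typically done with backticks and the `/m` or `/s` regex modifiers. As an illustration, here is the same idea sketched in Python; the record layout shown (EMBL-style `ID`/`DE`/`OS` lines) is an assumption for illustration, not the actual honest database output.

```python
import re

def organism_from_record(record):
    """Match the OS (organism) line of a multi-line record, even though
    it sits among other lines; re.MULTILINE makes ^ match at each line."""
    match = re.search(r"^OS\s+(.+)$", record, re.MULTILINE)
    return match.group(1) if match else None

# Assumed multi-line record shape; the real database output differs.
multi_line_output = (
    "ID   AB123456\n"
    "DE   Some sequence description\n"
    "OS   Halomonas sp. GFAJ-1\n"
)
print(organism_from_record(multi_line_output))
```

In a real pipeline, `record` would be the captured stdout of the database query command rather than a hard-coded string.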
Blog
Obtaining Species Names with Regular Expressions
Since at the end of the project I will show the user the names of the possible contaminating organisms (their Latin species names), I need to obtain the organism name for each sequence using the accession numbers in the MegaBLAST results. I can do this with another system called the Sequence Retrieval System (SRS), available on the HUSAR servers.
To get the organism name from SRS, it is enough to type the “getz” command on the Unix command line together with the database name, the accession number, and the field I want to retrieve. Below you can find a sample piece of code that does this job.
Tag: regular expressions
Blog
Running MegaBLAST from a Single FASTA File - Regular Expressions
Below are the Perl script I wrote to run MegaBLAST by reading a FASTA file and to collect the results in a directory, together with its explanation. This script is an important part of the pipeline I am designing. It is the first script I wrote, and the one that reaches all the reads through a single FASTA file.

#!/usr/local/bin/perl

$database = $ARGV[0];
$fasta = $ARGV[1];    # input file
$sp = $ARGV[2];       # starting point
$n = $ARGV[3] + $sp;

if (!defined($n)) { $n = 12; }   # set default number

open FASTA, $fasta or die $!
Blog
Obtaining Species Names with Regular Expressions
Since at the end of the project I will show the user the names of the possible contaminating organisms (their Latin species names), I need to obtain the organism name for each sequence using the accession numbers in the MegaBLAST results. I can do this with another system called the Sequence Retrieval System (SRS), available on the HUSAR servers.
To get the organism name from SRS, it is enough to type the “getz” command on the Unix command line together with the database name, the accession number, and the field I want to retrieve. Below you can find a sample piece of code that does this job.
Tag: husar
Blog
Obtaining Species Names with Regular Expressions
Since at the end of the project I will show the user the names of the possible contaminating organisms (their Latin species names), I need to obtain the organism name for each sequence using the accession numbers in the MegaBLAST results. I can do this with another system called the Sequence Retrieval System (SRS), available on the HUSAR servers.
To get the organism name from SRS, it is enough to type the “getz” command on the Unix command line together with the database name, the accession number, and the field I want to retrieve. Below you can find a sample piece of code that does this job.
Blog
SAM Files - BAM Files - samtools
The pipeline I need to program will actually run its analyses directly on unmapped reads. But since I could not find such a dataset, and the only data I have contains both mapped and unmapped reads, I first had to get rid of the mapped ones.
As I mentioned before, I do this with the bwa aligner (mapper). After a series of steps, bwa produces a SAM file, but what I need is a FASTQ file. For this, I will convert the SAM file with samtools into the similar BAM format, and then obtain my FASTQ file with the bam2fastq tool.
Blog
MegaBLAST - A Tool for Finding Similarities Between Sequences
MegaBLAST is part of the BLAST (Basic Local Alignment Search Tool) package available in the HUSAR suite. It is also a variant of BLASTN. MegaBLAST handles long sequences more efficiently than BLASTN and works much faster, but it is less sensitive. This makes it a very suitable tool for searching large databases for similar sequences.
The program I will write will take a FASTA file containing multiple sequences and run the megablast command. Afterwards, for each read, a .
Blog
The Contaminant Analysis Project
As a start, I will describe in detail this small project I was given so that I can get used to the tools and the programming language — in short, to bioinformatics.
We know that however hard we try to prevent it, the risk of contaminants is always present in laboratory work. The lower we bring it, the better — and we can later determine its amount and use that for yet another assessment of our results. One method for finding it is DNA analysis: the DNA of the sample you are working with is sequenced, that DNA is analyzed with various programs, and the contaminating organisms can be identified from their DNA.
Blog
The FASTQ Format - FASTQ Files
Today I received the “test” sequence data I will use while building the program. It consists of two FASTQ files, each compressed and yet still around 6 GB in size. Since I naturally do not want to lose too much time, I will use only a portion of one of these files.
My goal is to find the mappable reads in these FASTQ files with the BWA tool and then remove them, and to save the remaining unmapped reads in a language the MegaBLAST tool can understand (the FASTA format).
By the way, since I am preparing the whole project on a Unix machine, I am learning a lot of commands; I will try to write about them separately later.
Blog
Detecting the Organisms That Contaminate Sequencing Studies
The first project I will take on during my summer internship is slowly taking shape. In this project I will build a pipeline and use it to try to find the organisms that contaminate sequencing samples in laboratories.
In laboratories, samples can be contaminated by other organisms or by foreign DNA for many reasons. These can be bacteria or yeast, or even viral DNA. After you sequence a DNA sample, its alignment rate against its reference can turn out to be very low. This indicates that foreign DNA may be present. Another cause can be that the reference DNA itself is different.
Blog
Pipelines and Pipeline Development
Following up on today's introductory lectures, I received detailed information about pipelines and pipeline development. A pipeline is literally just that: a line of pipes, for example the system used to carry oil from one place to another. In computing terminology, it means a chain of processing elements arranged so that the output of one element becomes the input of the next. This lets much more complicated tasks be carried out easily and in an organized way by building a pipeline. I believe "pipeline" is translated into Turkish as "ardışık düzen", but I will keep using "pipeline".
Blog
WWW2HUSAR - HUSAR's Web Interface
On the second day of my internship we talked about HUSAR's web interface. HUSAR is software that can be used and managed with commands from the command prompt, but there is a web interface built to make this easier. With this interface, which they call WWW2HUSAR, you can easily select from the listed tools, add your genetic sequence, and carry out many other operations with just a few clicks.
Along with that, we looked a bit more at HUSAR's functions. In the software you can create lists of gene sequences in a local folder, compare the similarities of the genes with the multiple sequence alignment tool, and, for example, reveal their evolutionary relationships.
Blog
An Internship at the DKFZ - Heidelberg Bioinformatics Unit
The summer internship I am doing through the Erasmus program has begun. To start, I received a few hours of introductory lectures from the scientists who run the unit. In these lectures I learned about the unit's brief history, the projects it has carried out to date, and their details.
The Bioinformatics Unit is a group within the Genomics and Proteomics Core Facility, one of the core facilities of the DKFZ (Deutsches Krebsforschungszentrum, in English: German Cancer Research Center). They are also known as HUSAR (Heidelberg Unix Sequence Analysis Resources), and this name is also used for the sequence analysis package the group develops.
Tag: reference sequence
Blog
Obtaining the Species Name with Regular Expressions
Since at the end of my project I will show the user the names of the likely contaminating organisms (their Latin species names), I need to obtain the organism name for each sequence using the accession numbers in the MegaBLAST results. I can do this with another system available on the HUSAR servers, called the Sequence Retrieval System (SRS).
To get the organism name from SRS, it is enough to type the "getz" command on the Unix command line together with the database name, the accession number, and the field I want to retrieve. Below you can find a sample piece of code that does this job.
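The regular-expression step the title refers to can be sketched in Python. The output format assumed here (an EMBL-style "OS" line carrying the organism name) is my own illustration, not the exact getz output from the post:

```python
import re

def organism_name(getz_output):
    """Extract the Latin species name from SRS/getz-style output.
    Assumes an EMBL-style 'OS   Genus species ...' line."""
    match = re.search(r"^OS\s+([A-Z][a-z]+ [a-z]+)", getz_output, re.MULTILINE)
    return match.group(1) if match else None

# Hypothetical record fragment for illustration
sample = "ID   AC_000033\nOS   Mus musculus (house mouse)\n"
print(organism_name(sample))  # -> Mus musculus
```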
Blog
Choosing a Database
My aim in this project is to find the likely contaminating organisms (the contaminants), so I need a broad database. But while keeping the database broad provides that advantage, searching it for every sequence takes a great deal of computing power and time. For this reason, while developing my project I am also examining various databases, and investigating how I can restrict them to make them best suit my purpose.
I started with NCBI's Reference Sequence (RefSeq) database.
Tag: sequence retrieval system
Blog
Obtaining the Species Name with Regular Expressions
Since at the end of my project I will show the user the names of the likely contaminating organisms (their Latin species names), I need to obtain the organism name for each sequence using the accession numbers in the MegaBLAST results. I can do this with another system available on the HUSAR servers, called the Sequence Retrieval System (SRS).
To get the organism name from SRS, it is enough to type the "getz" command on the Unix command line together with the database name, the accession number, and the field I want to retrieve. Below you can find a sample piece of code that does this job.
Blog
An Internship at the DKFZ - Heidelberg Bioinformatics Unit
The summer internship I am doing through the Erasmus program has begun. To start, I received a few hours of introductory lectures from the scientists who run the unit. In these lectures I learned about the unit's brief history, the projects it has carried out to date, and their details.
The Bioinformatics Unit is a group within the Genomics and Proteomics Core Facility, one of the core facilities of the DKFZ (Deutsches Krebsforschungszentrum, in English: German Cancer Research Center). They are also known as HUSAR (Heidelberg Unix Sequence Analysis Resources), and this name is also used for the sequence analysis package the group develops.
Tag: e value
Blog
The Contents of a MegaBLAST Output - The RefSeq Database
Below we see the details of one hit, taken from the file I obtained by searching my test FASTA file against the refseq_genomic database.
>>>>refseq_genomic_complete3: AC_000033_0310 Continuation (311 of 1357) of AC_000033 from base 31000001 (AC_000033 Mus musculus strain mixed chromosome 11, alternate assembly Mm_Celera, whole genome shotgun sequence. 2/2012)

        Length = 110000

 Score = 115 bits (58), Expect = 4e-22
 Identities = 74/79 (93%), Gaps = 2/79 (2%)
 Strand = Plus / Minus

Query: 1     ctctctctgtct-tctctctctctctgtctctctctctttctctctcttctctctctctc 59
             |||||||||||| ||| ||||||||| ||||||||||| |||||||||||||||||||||
Sbjct: 89773 ctctctctgtctgtctttctctctctctctctctctctctctctctcttctctctctctc 89714

Query: 60    tttctctctgccctctctc 78
             ||||||||| |||||||||
Sbjct: 89713 tttctctct-ccctctctc 89696

In these details, the header information about the hit is given first, marked with the >>>> characters.
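As an illustration, the statistics in a hit header like the one above can be pulled out with a small Python parser; the field layout it assumes is exactly the one in this excerpt:

```python
import re

def parse_hit_stats(header):
    """Parse Score, Expect, Identities and Gaps from a MegaBLAST hit header."""
    stats = {}
    m = re.search(r"Score\s*=\s*([\d.]+) bits", header)
    stats["score_bits"] = float(m.group(1))
    m = re.search(r"Expect\s*=\s*([\d.e+-]+)", header)
    stats["expect"] = float(m.group(1))
    m = re.search(r"Identities\s*=\s*(\d+)/(\d+)", header)
    stats["identities"] = (int(m.group(1)), int(m.group(2)))
    m = re.search(r"Gaps\s*=\s*(\d+)/(\d+)", header)
    stats["gaps"] = (int(m.group(1)), int(m.group(2)))
    return stats

header = "Score = 115 bits (58), Expect = 4e-22, Identities = 74/79 (93%), Gaps = 2/79 (2%)"
print(parse_hit_stats(header))
```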
Tag: gaps
Blog
The Contents of a MegaBLAST Output - The RefSeq Database
Below we see the details of one hit, taken from the file I obtained by searching my test FASTA file against the refseq_genomic database.
>>>>refseq_genomic_complete3: AC_000033_0310 Continuation (311 of 1357) of AC_000033 from base 31000001 (AC_000033 Mus musculus strain mixed chromosome 11, alternate assembly Mm_Celera, whole genome shotgun sequence. 2/2012)

        Length = 110000

 Score = 115 bits (58), Expect = 4e-22
 Identities = 74/79 (93%), Gaps = 2/79 (2%)
 Strand = Plus / Minus

Query: 1     ctctctctgtct-tctctctctctctgtctctctctctttctctctcttctctctctctc 59
             |||||||||||| ||| ||||||||| ||||||||||| |||||||||||||||||||||
Sbjct: 89773 ctctctctgtctgtctttctctctctctctctctctctctctctctcttctctctctctc 89714

Query: 60    tttctctctgccctctctc 78
             ||||||||| |||||||||
Sbjct: 89713 tttctctct-ccctctctc 89696

In these details, the header information about the hit is given first, marked with the >>>> characters.
Tag: identity
Blog
The Contents of a MegaBLAST Output - The RefSeq Database
Below we see the details of one hit, taken from the file I obtained by searching my test FASTA file against the refseq_genomic database.
>>>>refseq_genomic_complete3: AC_000033_0310 Continuation (311 of 1357) of AC_000033 from base 31000001 (AC_000033 Mus musculus strain mixed chromosome 11, alternate assembly Mm_Celera, whole genome shotgun sequence. 2/2012)

        Length = 110000

 Score = 115 bits (58), Expect = 4e-22
 Identities = 74/79 (93%), Gaps = 2/79 (2%)
 Strand = Plus / Minus

Query: 1     ctctctctgtct-tctctctctctctgtctctctctctttctctctcttctctctctctc 59
             |||||||||||| ||| ||||||||| ||||||||||| |||||||||||||||||||||
Sbjct: 89773 ctctctctgtctgtctttctctctctctctctctctctctctctctcttctctctctctc 89714

Query: 60    tttctctctgccctctctc 78
             ||||||||| |||||||||
Sbjct: 89713 tttctctct-ccctctctc 89696

In these details, the header information about the hit is given first, marked with the >>>> characters.
Tag: megablast output
Blog
The Contents of a MegaBLAST Output - The RefSeq Database
Below we see the details of one hit, taken from the file I obtained by searching my test FASTA file against the refseq_genomic database.
>>>>refseq_genomic_complete3: AC_000033_0310 Continuation (311 of 1357) of AC_000033 from base 31000001 (AC_000033 Mus musculus strain mixed chromosome 11, alternate assembly Mm_Celera, whole genome shotgun sequence. 2/2012)

        Length = 110000

 Score = 115 bits (58), Expect = 4e-22
 Identities = 74/79 (93%), Gaps = 2/79 (2%)
 Strand = Plus / Minus

Query: 1     ctctctctgtct-tctctctctctctgtctctctctctttctctctcttctctctctctc 59
             |||||||||||| ||| ||||||||| ||||||||||| |||||||||||||||||||||
Sbjct: 89773 ctctctctgtctgtctttctctctctctctctctctctctctctctcttctctctctctc 89714

Query: 60    tttctctctgccctctctc 78
             ||||||||| |||||||||
Sbjct: 89713 tttctctct-ccctctctc 89696

In these details, the header information about the hit is given first, marked with the >>>> characters.
Tag: score
Blog
The Contents of a MegaBLAST Output - The RefSeq Database
Below we see the details of one hit, taken from the file I obtained by searching my test FASTA file against the refseq_genomic database.
>>>>refseq_genomic_complete3: AC_000033_0310 Continuation (311 of 1357) of AC_000033 from base 31000001 (AC_000033 Mus musculus strain mixed chromosome 11, alternate assembly Mm_Celera, whole genome shotgun sequence. 2/2012)

        Length = 110000

 Score = 115 bits (58), Expect = 4e-22
 Identities = 74/79 (93%), Gaps = 2/79 (2%)
 Strand = Plus / Minus

Query: 1     ctctctctgtct-tctctctctctctgtctctctctctttctctctcttctctctctctc 59
             |||||||||||| ||| ||||||||| ||||||||||| |||||||||||||||||||||
Sbjct: 89773 ctctctctgtctgtctttctctctctctctctctctctctctctctcttctctctctctc 89714

Query: 60    tttctctctgccctctctc 78
             ||||||||| |||||||||
Sbjct: 89713 tttctctct-ccctctctc 89696

In these details, the header information about the hit is given first, marked with the >>>> characters.
Tag: strand
Blog
The Contents of a MegaBLAST Output - The RefSeq Database
Below we see the details of one hit, taken from the file I obtained by searching my test FASTA file against the refseq_genomic database.
>>>>refseq_genomic_complete3: AC_000033_0310 Continuation (311 of 1357) of AC_000033 from base 31000001 (AC_000033 Mus musculus strain mixed chromosome 11, alternate assembly Mm_Celera, whole genome shotgun sequence. 2/2012)

        Length = 110000

 Score = 115 bits (58), Expect = 4e-22
 Identities = 74/79 (93%), Gaps = 2/79 (2%)
 Strand = Plus / Minus

Query: 1     ctctctctgtct-tctctctctctctgtctctctctctttctctctcttctctctctctc 59
             |||||||||||| ||| ||||||||| ||||||||||| |||||||||||||||||||||
Sbjct: 89773 ctctctctgtctgtctttctctctctctctctctctctctctctctcttctctctctctc 89714

Query: 60    tttctctctgccctctctc 78
             ||||||||| |||||||||
Sbjct: 89713 tttctctct-ccctctctc 89696

In these details, the header information about the hit is given first, marked with the >>>> characters.
Tag: cpu runtime
Blog
A Command's Runtime by Database - CPU Runtime
When the files and databases being worked on are large and you do not have enough computing power, the first thing we need to measure is how to obtain the result in the most effective way and in the shortest time.
In my project in particular, I am investigating this using different databases and different parameters.
For now I am trying four databases: nrnuc, ensembl_cdna, honest and refseq_genomic. I will also do this for two different word sizes. The word size is the number of base pairs MegaBLAST will match exactly while searching. So if I have a read of 151 base pairs and the word size is set to 50, the search will look, anywhere within those 151 base pairs, for stretches in which at least 50 consecutive bases align exactly.
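To make the word-size idea concrete, here is a small Python sketch (my own illustration, not MegaBLAST's actual seeding code) that checks whether a read shares an exact stretch of at least word_size bases with a subject sequence:

```python
def has_seed(read, subject, word_size):
    """Return True if `read` and `subject` share an exact common
    substring of at least `word_size` bases, the way the word-size
    parameter gates a MegaBLAST hit."""
    # All exact word_size-mers of the read
    words = {read[i:i + word_size] for i in range(len(read) - word_size + 1)}
    # A hit requires at least one of them to occur exactly in the subject
    return any(subject[i:i + word_size] in words
               for i in range(len(subject) - word_size + 1))

# With word size 5, a shared exact 5-mer is enough for a seed
print(has_seed("acgtacgtacgt", "ttttacgtattt", 5))  # -> True
print(has_seed("acgtacgtacgt", "tttttttttttt", 5))  # -> False
```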
Tag: ensembl_cdna
Blog
A Command's Runtime by Database - CPU Runtime
When the files and databases being worked on are large and you do not have enough computing power, the first thing we need to measure is how to obtain the result in the most effective way and in the shortest time.
In my project in particular, I am investigating this using different databases and different parameters.
For now I am trying four databases: nrnuc, ensembl_cdna, honest and refseq_genomic. I will also do this for two different word sizes. The word size is the number of base pairs MegaBLAST will match exactly while searching. So if I have a read of 151 base pairs and the word size is set to 50, the search will look, anywhere within those 151 base pairs, for stretches in which at least 50 consecutive bases align exactly.
Blog
Choosing a Database
My aim in this project is to find the likely contaminating organisms (the contaminants), so I need a broad database. But while keeping the database broad provides that advantage, searching it for every sequence takes a great deal of computing power and time. For this reason, while developing my project I am also examining various databases, and investigating how I can restrict them to make them best suit my purpose.
I started with NCBI's Reference Sequence (RefSeq) database.
Tag: honest
Blog
A Command's Runtime by Database - CPU Runtime
When the files and databases being worked on are large and you do not have enough computing power, the first thing we need to measure is how to obtain the result in the most effective way and in the shortest time.
In my project in particular, I am investigating this using different databases and different parameters.
For now I am trying four databases: nrnuc, ensembl_cdna, honest and refseq_genomic. I will also do this for two different word sizes. The word size is the number of base pairs MegaBLAST will match exactly while searching. So if I have a read of 151 base pairs and the word size is set to 50, the search will look, anywhere within those 151 base pairs, for stretches in which at least 50 consecutive bases align exactly.
Blog
Choosing a Database
My aim in this project is to find the likely contaminating organisms (the contaminants), so I need a broad database. But while keeping the database broad provides that advantage, searching it for every sequence takes a great deal of computing power and time. For this reason, while developing my project I am also examining various databases, and investigating how I can restrict them to make them best suit my purpose.
I started with NCBI's Reference Sequence (RefSeq) database.
Tag: nrnuc
Blog
A Command's Runtime by Database - CPU Runtime
When the files and databases being worked on are large and you do not have enough computing power, the first thing we need to measure is how to obtain the result in the most effective way and in the shortest time.
In my project in particular, I am investigating this using different databases and different parameters.
For now I am trying four databases: nrnuc, ensembl_cdna, honest and refseq_genomic. I will also do this for two different word sizes. The word size is the number of base pairs MegaBLAST will match exactly while searching. So if I have a read of 151 base pairs and the word size is set to 50, the search will look, anywhere within those 151 base pairs, for stretches in which at least 50 consecutive bases align exactly.
Tag: fastq fasta conversion
Blog
A Perl Script to Convert FASTQ to FASTA
The FASTQ and FASTA formats actually contain the same information; one of them simply carries two fewer lines of information per sequence. Another difference that matters for my project is that the FASTA format can be searched directly with MegaBLAST. That is why I need to convert the FASTQ format produced by sequencing machines into FASTA. And this script is the first step of the pipeline.
Actually, because the party who delivered my test sequencing data had not aligned it, I had already performed that alignment as a preliminary step.
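The conversion itself is simple; the post's script is in Perl, but the same logic looks like this in Python (the "@" header line becomes a ">" line, the sequence line is kept, and the plus and quality lines are dropped):

```python
def fastq_to_fasta(fastq_lines):
    """Convert FASTQ records (4 lines each) to FASTA records (2 lines each)."""
    fasta = []
    for i in range(0, len(fastq_lines), 4):
        fasta.append(">" + fastq_lines[i][1:])  # @id -> >id
        fasta.append(fastq_lines[i + 1])        # sequence line kept as-is
        # lines i+2 ('+') and i+3 (quality scores) are dropped
    return fasta

record = ["@read1", "ACGT", "+", "IIII"]
print(fastq_to_fasta(record))  # -> ['>read1', 'ACGT']
```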
Tag: converting fastq to fasta
Blog
A Perl Script to Convert FASTQ to FASTA
The FASTQ and FASTA formats actually contain the same information; one of them simply carries two fewer lines of information per sequence. Another difference that matters for my project is that the FASTA format can be searched directly with MegaBLAST. That is why I need to convert the FASTQ format produced by sequencing machines into FASTA. And this script is the first step of the pipeline.
Actually, because the party who delivered my test sequencing data had not aligned it, I had already performed that alignment as a preliminary step.
Tag: mapped reads
Blog
Results of Mapping and Extracting the Unmapped Reads
Previously I was working with only a part of the data, but from now on I will work with all of it. So I extracted the compressed data I was given directly into my working folder and ran the operations on it.
My initial (FASTQ) file is 2153988289 bytes (2 GB) in size. After mapping with bwa, there were 6004193 sequences, or reads, in total. Then, after I extracted the unmapped reads, the total read count dropped by 551065 to 5493128. In other words, about 9% of the data.
Blog
SAM Files - BAM Files - samtools
The pipeline I actually need to program will run its analyses directly on the unmapped reads. However, since I could not find such a dataset, and the only data I have contains both mapped and unmapped reads, I first had to get rid of the mapped ones.
As I mentioned before, I do this with the bwa aligner (mapper). After a series of steps bwa produces a SAM file, but what I need is a FASTQ file. For that, I will convert the SAM file into the similar BAM format with samtools, and then obtain my FASTQ file with the bam2fastq tool.
Blog
The First Step: Obtaining the Unmapped Reads
As I mentioned before, the first part of my project is extracting the unmapped reads from the FASTQ file. This way, I remove the sequences I do not need from the later analyses and reduce the workload of those analyses.
Since my goal from the start has been to design a pipeline that carries out the whole project step by step, I will do this with a Perl script. This script is the first one in the pipeline, and it is where the raw FASTQ-format data that will come from the laboratory is used as input. Actually I would not normally need this script; I only added this step because my data also contains mappable reads.
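In a SAM file, unmapped reads can be recognized by bit 0x4 of the FLAG field; the post's actual script is in Perl, but a minimal Python sketch of that filtering step looks like this:

```python
def unmapped_reads(sam_lines):
    """Yield the read names of unmapped records from SAM lines.
    Bit 0x4 of the FLAG column (second field) marks an unmapped read."""
    for line in sam_lines:
        if line.startswith("@"):        # skip SAM header lines
            continue
        fields = line.split("\t")
        if int(fields[1]) & 4:          # FLAG has the 'unmapped' bit set
            yield fields[0]

sam = [
    "@HD\tVN:1.0",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",   # mapped
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII",          # unmapped
]
print(list(unmapped_reads(sam)))  # -> ['read2']
```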
Blog
Detecting the Organisms That Contaminate Sequencing Studies
The first project I will take on during my summer internship is slowly taking shape. In this project I will build a pipeline and use it to try to find the organisms that contaminate sequencing samples in laboratories.
In laboratories, samples can be contaminated by other organisms or by foreign DNA for many reasons. These can be bacteria or yeast, or even viral DNA. After you sequence a DNA sample, its alignment rate against its reference can turn out to be very low. This indicates that foreign DNA may be present. Another cause can be that the reference DNA itself is different.
Tag: extracting unmapped reads
Blog
Results of Mapping and Extracting the Unmapped Reads
Previously I was working with only a part of the data, but from now on I will work with all of it. So I extracted the compressed data I was given directly into my working folder and ran the operations on it.
My initial (FASTQ) file is 2153988289 bytes (2 GB) in size. After mapping with bwa, there were 6004193 sequences, or reads, in total. Then, after I extracted the unmapped reads, the total read count dropped by 551065 to 5493128. In other words, about 9% of the data.
Blog
Mapping (Alignment) with BWA
I had forgotten to write about this earlier. I did mention it, but I never wrote anything about how it is done, nor did I add example commands.
BWA takes our DNA reads (in FASTQ format) and the reference genome (in my project, the human genome) and produces a .sai file. This file carries information about the alignment between the reads and the reference genome, and using this information I can separate out the unmapped ones.
First, we create our .sai file with the command below.
bwa aln $NGSDATAROOT/bwa/human_genome37 ChIP_NoIndex_L001_R1_complete_filtered.fastq > complete_alignment.sai
The .sai file we created is not very useful on its own, so we convert it into a SAM file and continue from there.
Tag: alignment
Blog
Mapping (Alignment) with BWA
I had forgotten to write about this earlier. I did mention it, but I never wrote anything about how it is done, nor did I add example commands.
BWA takes our DNA reads (in FASTQ format) and the reference genome (in my project, the human genome) and produces a .sai file. This file carries information about the alignment between the reads and the reference genome, and using this information I can separate out the unmapped ones.
First, we create our .sai file with the command below.
bwa aln $NGSDATAROOT/bwa/human_genome37 ChIP_NoIndex_L001_R1_complete_filtered.fastq > complete_alignment.sai
The .sai file we created is not very useful on its own, so we convert it into a SAM file and continue from there.
Tag: aln
Blog
Mapping (Alignment) with BWA
I had forgotten to write about this earlier. I did mention it, but I never wrote anything about how it is done, nor did I add example commands.
BWA takes our DNA reads (in FASTQ format) and the reference genome (in my project, the human genome) and produces a .sai file. This file carries information about the alignment between the reads and the reference genome, and using this information I can separate out the unmapped ones.
First, we create our .sai file with the command below.
bwa aln $NGSDATAROOT/bwa/human_genome37 ChIP_NoIndex_L001_R1_complete_filtered.fastq > complete_alignment.sai
The .sai file we created is not very useful on its own, so we convert it into a SAM file and continue from there.
Tag: mapping (eşleştirme)
Blog
Mapping (Alignment) with BWA
I had forgotten to write about this earlier. I did mention it, but I never wrote anything about how it is done, nor did I add example commands.
BWA takes our DNA reads (in FASTQ format) and the reference genome (in my project, the human genome) and produces a .sai file. This file carries information about the alignment between the reads and the reference genome, and using this information I can separate out the unmapped ones.
First, we create our .sai file with the command below.
bwa aln $NGSDATAROOT/bwa/human_genome37 ChIP_NoIndex_L001_R1_complete_filtered.fastq > complete_alignment.sai
The .sai file we created is not very useful on its own, so we convert it into a SAM file and continue from there.
Tag: human genome
Blog
Mapping (Alignment) with BWA
I had forgotten to write about this earlier. I did mention it, but I never wrote anything about how it is done, nor did I add example commands.
BWA takes our DNA reads (in FASTQ format) and the reference genome (in my project, the human genome) and produces a .sai file. This file carries information about the alignment between the reads and the reference genome, and using this information I can separate out the unmapped ones.
First, we create our .sai file with the command below.
bwa aln $NGSDATAROOT/bwa/human_genome37 ChIP_NoIndex_L001_R1_complete_filtered.fastq > complete_alignment.sai
The .sai file we created is not very useful on its own, so we convert it into a SAM file and continue from there.
Tag: mapping
Blog
Mapping (Alignment) with BWA
I had forgotten to write about this earlier. I did mention it, but I never wrote anything about how it is done, nor did I add example commands.
BWA takes our DNA reads (in FASTQ format) and the reference genome (in my project, the human genome) and produces a .sai file. This file carries information about the alignment between the reads and the reference genome, and using this information I can separate out the unmapped ones.
First, we create our .sai file with the command below.
bwa aln $NGSDATAROOT/bwa/human_genome37 ChIP_NoIndex_L001_R1_complete_filtered.fastq > complete_alignment.sai
The .sai file we created is not very useful on its own, so we convert it into a SAM file and continue from there.
Tag: sai
Blog
Mapping (Alignment) with BWA
I had forgotten to write about this earlier. I did mention it, but I never wrote anything about how it is done, nor did I add example commands.
BWA takes our DNA reads (in FASTQ format) and the reference genome (in my project, the human genome) and produces a .sai file. This file carries information about the alignment between the reads and the reference genome, and using this information I can separate out the unmapped ones.
First, we create our .sai file with the command below.
bwa aln $NGSDATAROOT/bwa/human_genome37 ChIP_NoIndex_L001_R1_complete_filtered.fastq > complete_alignment.sai
The .sai file we created is not very useful on its own, so we convert it into a SAM file and continue from there.
Tag: samse
Blog
Mapping (Alignment) with BWA
I had forgotten to write about this earlier. I did mention it, but I never wrote anything about how it is done, nor did I add example commands.
BWA takes our DNA reads (in FASTQ format) and the reference genome (in my project, the human genome) and produces a .sai file. This file carries information about the alignment between the reads and the reference genome, and using this information I can separate out the unmapped ones.
First, we create our .sai file with the command below.
bwa aln $NGSDATAROOT/bwa/human_genome37 ChIP_NoIndex_L001_R1_complete_filtered.fastq > complete_alignment.sai
The .sai file we created is not very useful on its own, so we convert it into a SAM file and continue from there.
Tag: samtools
Blog
SAM Files - BAM Files - samtools
The pipeline I actually need to program will run its analyses directly on the unmapped reads. However, since I could not find such a dataset, and the only data I have contains both mapped and unmapped reads, I first had to get rid of the mapped ones.
As I mentioned before, I do this with the bwa aligner (mapper). After a series of steps bwa produces a SAM file, but what I need is a FASTQ file. For that, I will convert the SAM file into the similar BAM format with samtools, and then obtain my FASTQ file with the bam2fastq tool.
Tag: samtools view
Blog
SAM Files - BAM Files - samtools
The pipeline I actually need to program will run its analyses directly on the unmapped reads. However, since I could not find such a dataset, and the only data I have contains both mapped and unmapped reads, I first had to get rid of the mapped ones.
As I mentioned before, I do this with the bwa aligner (mapper). After a series of steps bwa produces a SAM file, but what I need is a FASTQ file. For that, I will convert the SAM file into the similar BAM format with samtools, and then obtain my FASTQ file with the bam2fastq tool.
Tag: paired-end reads (çift-sonlu okuma)
Blog
The First Step: Obtaining the Unmapped Reads
As I mentioned before, the first part of my project is extracting the unmapped reads from the FASTQ file. This way, I remove the sequences I do not need from the later analyses and reduce the workload of those analyses.
Since my goal from the start has been to design a pipeline that carries out the whole project step by step, I will do this with a Perl script. This script is the first one in the pipeline, and it is where the raw FASTQ-format data that will come from the laboratory is used as input. Actually I would not normally need this script; I only added this step because my data also contains mappable reads.
Tag: extract unmapped reads
Blog
The First Step: Obtaining the Unmapped Reads
As I mentioned before, the first part of my project is extracting the unmapped reads from the FASTQ file. This way, I remove the sequences I do not need from the later analyses and reduce the workload of those analyses.
Since my goal from the start has been to design a pipeline that carries out the whole project step by step, I will do this with a Perl script. This script is the first one in the pipeline, and it is where the raw FASTQ-format data that will come from the laboratory is used as input. Actually I would not normally need this script; I only added this step because my data also contains mappable reads.
Tag: mapped reads
Blog
The First Step: Obtaining the Unmapped Reads
As I mentioned before, the first part of my project is extracting the unmapped reads from the FASTQ file. This way, I remove the sequences I do not need from the later analyses and reduce the workload of those analyses.
Since my goal from the start has been to design a pipeline that carries out the whole project step by step, I will do this with a Perl script. This script is the first one in the pipeline, and it is where the raw FASTQ-format data that will come from the laboratory is used as input. Actually I would not normally need this script; I only added this step because my data also contains mappable reads.
Tag: paired-end reads
Blog
The First Step: Obtaining the Unmapped Reads
As I mentioned before, the first part of my project is extracting the unmapped reads from the FASTQ file. This way, I remove the sequences I do not need from the later analyses and reduce the workload of those analyses.
Since my goal from the start has been to design a pipeline that carries out the whole project step by step, I will do this with a Perl script. This script is the first one in the pipeline, and it is where the raw FASTQ-format data that will come from the laboratory is used as input. Actually I would not normally need this script; I only added this step because my data also contains mappable reads.
Tag: single-end reads
Blog
The First Step: Obtaining the Unmapped Reads
As I mentioned before, the first part of my project is extracting the unmapped reads from the FASTQ file. This way, I remove the sequences I do not need from the later analyses and reduce the workload of those analyses.
Since my goal from the start has been to design a pipeline that carries out the whole project step by step, I will do this with a Perl script. This script is the first one in the pipeline, and it is where the raw FASTQ-format data that will come from the laboratory is used as input. Actually I would not normally need this script; I only added this step because my data also contains mappable reads.
Tag: single-end reads (tek-sonlu okuma)
Blog
The First Step: Obtaining the Unmapped Reads
As I mentioned before, the first part of my project is extracting the unmapped reads from the FASTQ file. This way, I remove the sequences I do not need from the later analyses and reduce the workload of those analyses.
Since my goal from the start has been to design a pipeline that carries out the whole project step by step, I will do this with a Perl script. This script is the first one in the pipeline, and it is where the raw FASTQ-format data that will come from the laboratory is used as input. Actually I would not normally need this script; I only added this step because my data also contains mappable reads.
Tag: unmapped reads
Blog
İlk Adım: Eşleşmeyen Okumaları Elde Etmek
Projemin ilk kısmı daha önce bahsettiğim gibi eşleşmeyen okumaları (unmapped reads) FASTQ dosyasından çıkarmak. Böylece, daha sonraki analizler için elimdeki ihtiyacım olmayan dizileri çıkarmış ve bu analizlerdeki iş yükünü azaltmış oluyorum.
Başından beri hedefim, tüm projeyi adım adım gerçekleştiren bir pipeline tasarlamak olduğu için bu işlemi bir Perl scripti ile yapacağım. Bu script pipeline’in ilk scripti ve laboratuvardan gelecek ham (raw) FASTQ formatındaki verinin girdi (input) olarak kullanılacağı yer. Aslında bu scripte ihtiyacım olmayacak, sadece elimdeki verinin eşlenebilen verileri de içermesi sebebiyle bu adımı ekledim.
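The post implements this step as a Perl script that is not shown; as a rough illustration of the filtering it describes, here is a minimal Python sketch. It assumes the mapper's output is in SAM format, where flag bit 0x4 marks an unmapped read, and writes the unmapped reads out as FASTA records:

```python
def unmapped_to_fasta(sam_lines):
    """Yield FASTA records for reads whose SAM flag has bit 0x4 set
    (0x4 = segment unmapped)."""
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & 0x4:                    # read did not map to the reference
            yield f">{name}\n{seq}"

# Tiny hand-made SAM stream: r1 is mapped, r2 is unmapped.
sam = [
    "@HD\tVN:1.0",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII",
]
print(list(unmapped_to_fasta(sam)))       # only r2 survives the filter
```

This is only a sketch of the idea, not the author's pipeline code; a real run would stream the output of the aligner rather than a hard-coded list.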
Tag: forwarding blog posts to facebook
Tag: auto-publishing blog posts on facebook
Tag: from blog to facebook
Tag: from blog to linkedin
Tag: from blog to twitter
Tag: blogspot
Tag: facebook
Tag: feed
Tag: linkedin
Tag: twitter
Tag: twitterfeed
Blog
Forwarding Blog Posts to Facebook, Twitter, and LinkedIn
Starting a blog on a topic I am interested in and writing informative posts had been on my mind for a long time. I have finally started writing, little by little. I hope it has gone well so far.
In this post I will cover an "off-topic" subject that is not closely related to the blog's theme.
I wanted to use social media to reach a wide audience easily, but copy-pasting each post's link every single time is no simple chore.
After some searching, I found a tool that lets us connect our blog to our Facebook, Twitter, and LinkedIn accounts and forward new posts to all of them at once.
Tag: basic local alignment search tool
Tag: blastn
Blog
MegaBLAST - A Tool for Finding Similarities Between Sequences
MegaBLAST is part of the BLAST (Basic Local Alignment Search Tool) package included in the HUSAR suite, and a variant of BLASTN. MegaBLAST handles long sequences more efficiently than BLASTN and runs much faster, but is less sensitive. This makes it well suited to searching large databases for similar sequences.
The program I will write will take a FASTA file containing multiple sequences and run the megablast command. After that, for each read, a …
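Before anything can be handed to megablast, the multi-sequence FASTA file has to be split into individual records. As a small illustration of that first step (a generic FASTA reader, not the author's Perl code), a Python sketch:

```python
def read_fasta(lines):
    """Parse a multi-sequence FASTA stream into (header, sequence) pairs.
    A record starts at a '>' line; sequence lines until the next '>' are joined."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)   # emit the finished record
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)           # emit the last record

fasta = [">read1", "ACGT", "TTGG", ">read2", "CCAA"]
print(list(read_fasta(fasta)))
```

Each `(header, sequence)` pair could then be written to its own file or passed to the `megablast` command line; that invocation is omitted here since its exact options depend on the installation.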
Tag: dkfz
Blog
MegaBLAST - A Tool for Finding Similarities Between Sequences
Blog
The Contaminant (Pollutant) Analysis Project
To start with, I will describe in detail the small project I was given so that I can get used to the tools, the programming language — in short, to bioinformatics.
We know that, however hard we try to prevent it in our laboratory work, the risk of contaminants is always present. The less of it the better; later on we can even quantify it and use that figure as another check on our result. One method for finding it is DNA analysis: the DNA of the sample you are working on is sequenced, that DNA is analyzed with various programs, and the contaminating organisms can be identified from their DNA.
Blog
The FASTQ Format - FASTQ Files
Today I received the "test" sequence data I will use while building the program. It consists of two FASTQ files, each compressed and yet still around 6 GB in size. Since I do not want to lose too much time, I will of course use only a portion of one of these files.
My goal is to find the mappable reads in these FASTQ files with the BWA tool and then remove them, and to save the remaining unmapped reads in a language MegaBLAST can understand (the FASTA format).
Since I am doing the whole project on a Unix machine, I am learning many commands along the way; I will try to write about those separately later.
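For reference, a FASTQ file stores each read as four lines: an `@identifier` line, the sequence, a `+` separator, and a per-base quality string. A minimal Python sketch of walking such a file (illustrative only; real files would be read with `gzip.open` since they arrive compressed):

```python
def fastq_records(lines):
    """Group a FASTQ stream into its 4-line records:
    @identifier / sequence / + / quality string."""
    it = iter(lines)
    for header in it:
        seq, plus, qual = next(it), next(it), next(it)
        # sanity-check the record markers
        assert header.startswith("@") and plus.startswith("+")
        yield header[1:], seq, qual

fq = ["@r1", "ACGT", "+", "IIII", "@r2", "GGCC", "+", "FFFF"]
print(list(fastq_records(fq)))
```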
Blog
BWA (Burrows-Wheeler Aligner) - an Aligner / Mapper
As I mentioned in my previous post, I will use an aligner (or mapper) to find out to what extent my data maps to the reference genome. I will then run some analyses on the unmapped part.
BWA (Burrows-Wheeler Aligner) is a program that aligns relatively short sequences against long reference genomes such as the human genome. The bwa-short algorithm is used for reads up to 200 bp (bp: base pairs) in length, and the BWA-SW algorithm for 200 bp to 100 kbp.
Many factors play a role in choosing an aligner / mapper. There are many tools of this kind, each with different features.
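The read-length rule of thumb described above can be stated as a tiny helper (purely illustrative; in practice the algorithm is selected via BWA's own command line, not code like this):

```python
def bwa_algorithm(read_length_bp):
    """Pick the BWA algorithm suggested by the post, based on read length."""
    if read_length_bp <= 200:
        return "bwa-short"       # short reads, up to 200 bp
    elif read_length_bp <= 100_000:
        return "BWA-SW"          # longer reads, 200 bp to 100 kbp
    raise ValueError("outside the ranges described in the post")

print(bwa_algorithm(100), bwa_algorithm(5000))
```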
Blog
Pipelines and Pipeline Development
Continuing the introductory lessons I received today, I got detailed information on pipelines and pipeline development. A pipeline is literally a pipe line, such as the system used to carry oil from one place to another through pipes. In computing terminology, it means a chain of processing elements arranged so that the output of one element is the input of the next. This way, far more complicated operations can be carried out easily and in an organized fashion by building a pipeline. I believe pipeline is translated into Turkish as "ardışık düzen", but I will stick with "pipeline".
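The definition above — a chain in which one element's output is the next element's input — can be sketched in a few lines of Python (a generic illustration with made-up toy stages, not the author's actual pipeline):

```python
from functools import reduce

def pipeline(*stages):
    """Compose processing elements so that each stage's output
    feeds the next stage's input."""
    return lambda data: reduce(lambda acc, stage: stage(acc), stages, data)

# Toy stages standing in for real steps (filter, convert):
strip_short = lambda reads: [r for r in reads if len(r) >= 4]   # drop short reads
upper = lambda reads: [r.upper() for r in reads]                # normalize case

run = pipeline(strip_short, upper)
print(run(["acgt", "gg", "ttaacc"]))   # ['ACGT', 'TTAACC']
```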
Blog
WWW2HUSAR - HUSAR's Web Interface
On the second day of my internship we talked about HUSAR's web interface. HUSAR is software that can be used and managed with commands from the command prompt, but a web interface has been built to make this easier. With this interface, which they named WWW2HUSAR, you can easily pick from the listed tools, add your genetic sequence, and carry out many other operations with just a few clicks.
We also looked a bit further into HUSAR's functions. In the software, by creating gene sequence lists in a local folder, you can compare the similarities of genes with the multiple sequence alignment tool and, for instance, uncover their evolutionary relationships.
Blog
Internship at the DKFZ - Heidelberg Bioinformatics Unit
The summer internship I am doing through the Erasmus program has begun. First I received a few hours of introductory lessons from one of the scientists who run the unit. In these lessons I learned about the unit's short history, the projects it has carried out to date, and their details.
The Bioinformatics Unit is a group attached to the Genomics and Proteomics Core Facility, one of the core facilities of the DKFZ (Deutsches Krebsforschungszentrum, Eng. German Cancer Research Center). They also go by the name HUSAR (Heidelberg Unix Sequence Analysis Resources), which is also the name of the sequence analysis package the group develops.
Tag: bioinformatics
Blog
The Contaminant (Pollutant) Analysis Project
Blog
Pipelines and Pipeline Development
Blog
WWW2HUSAR - HUSAR's Web Interface
Blog
Biyoinformatik or Biyoenformatik?
While looking for topics for my posts, I browse the internet as well as books. There are of course plenty of foreign sources and they are sufficient, but when I looked at Turkish sources, the first thing that caught my eye was the different spellings of this field's name.
As you know, in English the field is called bioinformatics. That is quite natural, since the English informatics comes from the word information plus the -ics suffix, and that word has a Latin origin1. The word entered Turkish from the French informatique as enformatik, and bilişim has also been proposed as a Turkish equivalent2. The French word, of course, shares the same origin as the English one.
Blog
7th International Symposium on Health Informatics and Bioinformatics
The 7th International Symposium on Health Informatics and Bioinformatics (HIBIT 2012), first organized in 2005 by the METU Informatics Institute, aims to bring together academics and researchers in Health Informatics, Medical Informatics, Computational Biology, and Bioinformatics, to provide a venue for presenting work in these fields, and to enable interactive discussion of that work.
This year, HIBIT 2012 will be held on 19-22 April 2012 at the Perissia Hotel in Ürgüp, Nevşehir, organized in partnership with METU, the METU Informatics Institute, the METU Department of Biological Sciences, and the METU Department of Computer Engineering.
Blog
What Is Bioinformatics? A Definition of Bioinformatics
With the sequencing of many organisms' genomes and, finally, of the human genome in 2001 — giving us the sequence of all 3 billion base pairs — fields emerged to use this information in different ways. Alongside the fields that try to understand these genes and to determine the proteins they encode, the need to analyze this information gave birth to the field of Bioinformatics.
Bioinformatics is the analysis of biological information using computers and statistical techniques; in other words, bioinformatics is the science of developing and utilizing computer databases and algorithms to improve and accelerate biological research [1].
Blog
Welcome!
Hello,
Through this blog I will be learning (together with my prospective visitors) about Bioinformatics, my particular interest within biology and a field I need to explore further and learn much more about. I just finished my first post with definitions of Bioinformatics given by various authorities. Later on I also want to cover the definitions of many of the principles that come up in Bioinformatics. Programming languages and statistical methods relevant to Bioinformatics will also be among my topics. At the same time, I plan to cover Bioinformatics news and, through it, to follow (and help you follow) the latest developments.
Tag: dna sequence
Tag: pollutant analysis
Tag: contaminant analysis
Tag: tel aviv university
Blog
The Contaminant (Pollutant) Analysis Project
Tag: gzip
Tag: header
Tag: identifier
Blog
The FASTQ Format - FASTQ Files
Tag: accuracy
Tag: burrows-wheeler aligner
Tag: bwa mapper
Tag: mapper
Tag: genome mapper
Tag: aligner
Tag: structural variations
Blog
BWA (Burrows-Wheeler Aligner) - an Aligner / Mapper
Tag: bacteria
Blog
Detecting the Organisms that Contaminate Sequencing Studies
The first study I will begin in my summer internship is slowly taking shape. In it, I will build a pipeline and use it to find the organisms that contaminate sequencing samples in laboratories.
In laboratories, samples can become contaminated by other organisms or by foreign DNA for many reasons. These can be bacteria or yeast, or even viral DNA. After you sequence a DNA sample, its alignment rate against the reference can come out very low, which suggests that foreign DNA may be present. Another possible cause is that the reference DNA itself is different.
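The signal described here — a low fraction of reads aligning to the reference — is simple to compute. A minimal sketch with hypothetical counts (the 0.9 cutoff is an illustrative threshold, not one taken from the post):

```python
def mapping_rate(mapped, total):
    """Fraction of reads that aligned to the reference; an unusually low
    value can hint at foreign DNA in the sample, as described above."""
    return mapped / total

# Hypothetical counts for illustration only.
rate = mapping_rate(mapped=700_000, total=1_000_000)
print(rate, rate < 0.9)   # a rate well below the cutoff warrants a closer look
```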
Tag: dizileme
Blog
Dizileme Çalışmalarını Kirleten Organizmaları Tespit Etme
Bu yaz stajımda ilk olarak başlayacağım çalışma yavaş yavaş şekilleniyor. Bu çalışmada bir pipeline oluşturup, bunu laboratuvarlarda dizileme (sequencing) örneklerini kirleten organizmaları bulmaya çalışacağım.
Laboratuvarlarda birçok nedenden dolayı örnekler başka organizmalar ya da yabancı DNA tarafından kirlenebiliyor. Bunlar bakteri, maya olabilir ya da bir virüs DNA’sı da olabilir. Siz bir DNA’yı diziledikten sonra onun referansıyla eşleştirme çok az oranda çıkabiliyor. Bu da yabancı DNA’nın olabileceğini gösteriyor. Bir başka neden referans DNA’nın farklı olması da olabilir.
Tag: genom dizileme
Blog
Dizileme Çalışmalarını Kirleten Organizmaları Tespit Etme
Bu yaz stajımda ilk olarak başlayacağım çalışma yavaş yavaş şekilleniyor. Bu çalışmada bir pipeline oluşturup, bunu laboratuvarlarda dizileme (sequencing) örneklerini kirleten organizmaları bulmaya çalışacağım.
Laboratuvarlarda birçok nedenden dolayı örnekler başka organizmalar ya da yabancı DNA tarafından kirlenebiliyor. Bunlar bakteri, maya olabilir ya da bir virüs DNA’sı da olabilir. Siz bir DNA’yı diziledikten sonra onun referansıyla eşleştirme çok az oranda çıkabiliyor. Bu da yabancı DNA’nın olabileceğini gösteriyor. Bir başka neden referans DNA’nın farklı olması da olabilir.
Tag: kirletici
Blog
Dizileme Çalışmalarını Kirleten Organizmaları Tespit Etme
Bu yaz stajımda ilk olarak başlayacağım çalışma yavaş yavaş şekilleniyor. Bu çalışmada bir pipeline oluşturup, bunu laboratuvarlarda dizileme (sequencing) örneklerini kirleten organizmaları bulmaya çalışacağım.
Laboratuvarlarda birçok nedenden dolayı örnekler başka organizmalar ya da yabancı DNA tarafından kirlenebiliyor. Bunlar bakteri, maya olabilir ya da bir virüs DNA’sı da olabilir. Siz bir DNA’yı diziledikten sonra onun referansıyla eşleştirme çok az oranda çıkabiliyor. Bu da yabancı DNA’nın olabileceğini gösteriyor. Bir başka neden referans DNA’nın farklı olması da olabilir.
Tag: maya
Blog
Dizileme Çalışmalarını Kirleten Organizmaları Tespit Etme
Bu yaz stajımda ilk olarak başlayacağım çalışma yavaş yavaş şekilleniyor. Bu çalışmada bir pipeline oluşturup, bunu laboratuvarlarda dizileme (sequencing) örneklerini kirleten organizmaları bulmaya çalışacağım.
Laboratuvarlarda birçok nedenden dolayı örnekler başka organizmalar ya da yabancı DNA tarafından kirlenebiliyor. Bunlar bakteri, maya olabilir ya da bir virüs DNA’sı da olabilir. Siz bir DNA’yı diziledikten sonra onun referansıyla eşleştirme çok az oranda çıkabiliyor. Bu da yabancı DNA’nın olabileceğini gösteriyor. Bir başka neden referans DNA’nın farklı olması da olabilir.
Tag: pipeline development
Tag: pipeline geliştirme
Tag: viral dna
Tag: virüs
Tag: erasmus
Blog
WWW2HUSAR - HUSAR's Web Interface
On the second day of my internship we talked about HUSAR's web interface. HUSAR is software that can be used and managed with commands from the command prompt, but a web interface has been built to make this easier. With this interface, which they call WWW2HUSAR, you can easily select from the listed tools, add your genetic sequence, and perform many other operations with just a few clicks.
We also looked a bit more at HUSAR's functions. In the software, you can build lists of gene sequences in a local folder and then compare the genes' similarities with the multiple sequence alignment tool, for example to uncover their evolutionary relationships.
Blog
DKFZ - Internship at the Heidelberg Bioinformatics Unit
The summer internship I am doing through the Erasmus programme has begun. First, I received a few hours of introductory lectures from the scientists who run the unit. In these lectures I learned about the unit's short history, the projects it has carried out to date, and their details.
The Bioinformatics Unit is a group attached to the Genomics and Proteomics Core Facility, one of the core facilities of the DKFZ (Deutsches Krebsforschungszentrum, the German Cancer Research Center). The group also goes by the name HUSAR (Heidelberg Unix Sequence Analysis Resources), which is also the name of the sequence analysis package it develops.
Tag: w2h
Tag: heidelberg
Tag: www to husar
Tag: www2husar
Tag: centos
Tag: german cancer research center
Tag: biyoinformatik
Blog
Biyoinformatik or Biyoenformatik?
While hunting for topics for my posts, I browse the internet as well as books. There are of course plenty of adequate foreign sources, but when I looked at Turkish sources, the first thing that caught my eye was the different forms used for the field's name.
As you know, in English the field is called bioinformatics. That makes sense, since in English "informatics" comes from "information" plus the suffix "-ics", and that word has a Latin origin1. The word "enformatik" entered Turkish from the French "informatique", and "bilişim" has also been proposed as a Turkish equivalent2. The French word, of course, shares the same root as the English one.
Tag: enformasyon
Tag: etimoloji
Tag: fransızca
Tag: ingilizce
Tag: latince
Tag: tdk
Tag: türkçe
Tag: wiktionary
Tag: hesaplamalı biyoloji
Blog
7th International Symposium on Health Informatics and Bioinformatics
The 7th International Symposium on Health Informatics and Bioinformatics (HIBIT 2012) was first organized in 2005 by the METU Informatics Institute. It aims to bring together academics and researchers in Health Informatics, Medical Informatics, Computational Biology, and Bioinformatics, to provide a venue for presenting work in these fields, and to enable interactive discussion of that work.
This year, HIBIT 2012 will be held on 19-22 April 2012 at the Perissia Hotel in Ürgüp, Nevşehir, organized in partnership with METU, the METU Informatics Institute, the METU Department of Biological Sciences, and the METU Department of Computer Engineering.
Tag: hibit
Tag: odtü bilgisayar
Tag: odtü biyoenformatik
Tag: odtü biyoloji
Tag: odtü moleküler biyoloji ve genetik
Tag: tıbbi enformatik
Tag: bilgisayar
Blog
What Is Bioinformatics? A Definition of Bioinformatics
With the sequencing of the genomes of many organisms, and most recently of the human genome in 2001, with all 3 billion base pairs obtained, fields emerged that would use this information in different ways. Alongside the fields that try to understand these genes and to determine the proteins they encode, the need to analyze this information gave rise to the field of Bioinformatics.
Bioinformatics is the analysis of biological information using computers and statistical techniques; in other words, bioinformatics is the science of developing and exploiting computer databases and algorithms to improve and accelerate biological research [1].
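As a minimal illustration of "analyzing biological information with a computer", here is a classic first exercise: computing the GC content of a DNA sequence. The example sequence is invented.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in a DNA sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)

# 4 of the 8 bases are G or C -> 0.5
print(round(gc_content("ATGCGCAT"), 2))  # 0.5
```

Real analyses chain many such small computations over databases of sequences, which is where the algorithms and statistics mentioned in the definition come in.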
Tag: bilgisayar bilimi
Tag: bilim
Tag: bilişim
Tag: bilişim teknolojisi
Tag: biyoenformatik nedir
Tag: biyoenformatik tanımı
Tag: biyoloji
Tag: biyoteknoloji
Tag: nih
Tag: teknoloji
Tag: tıp
Tag: giriş yazısı
Blog
Here I Am! Welcome!
Hello,
Through this blog I will be learning (together with any visitors I may have) about Bioinformatics, a field within biology that particularly interests me and that I still need to explore and learn a great deal about. A little while ago I finished my first post, on the definitions of Bioinformatics given by various authorities. Later, I also want to cover the definitions of many of the principles that come up in Bioinformatics. Programming languages and statistical methods relevant to Bioinformatics will be further topics of my posts. At the same time, I plan to include Bioinformatics news and, through it, to follow (and help others follow) the latest developments.
Tag: weblog