Merge NCBI and Ensembl data


Apologies if this is a really naive question, but I cannot figure out how to do this easily. Here is a related post regarding the best method to find orthologous genes of a species.


Let's say I have a protein alignment downloaded from Ensembl (coming, for instance, from this Ensembl tree).

This gene is also present in some "NCBI species" that I would like to include in my tree (for instance, Stegastes partitus, whose genome is available in the NCBI database but NOT in the Ensembl database). Indeed, if I manually run blastp with the asip protein sequence of D. rerio (extracted from my Ensembl multifasta protein alignment) against the nr database restricted to S. partitus, the first blast hit is the sequence I am looking for. Perfect! I can then manually append it to my initial protein alignment.
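For a single gene and a single species, the manual step described above could be scripted roughly as follows. This is only a sketch, assuming Biopython is installed and that its `Bio.Blast.NCBIWWW`, `Bio.Blast.NCBIXML` and `Bio.Entrez` modules are available; the helper names (`ungap`, `organism_filter`, `fetch_top_hit`) and the `email` parameter are made up for illustration. It takes the first hit at face value, exactly like the manual search, so whether that hit is a true ortholog still has to be checked.

```python
def ungap(aligned_seq: str) -> str:
    """Strip alignment gap characters so the sequence can be used as a BLAST query."""
    return aligned_seq.replace("-", "").replace(".", "")


def organism_filter(species: str) -> str:
    """Build an Entrez query that restricts the search to one organism."""
    return f'"{species}"[Organism]'


def fetch_top_hit(aligned_query: str, species: str, email: str):
    """Remote blastp against nr limited to `species`; return the top hit as FASTA text.

    Returns None when BLAST reports no hit. Needs network access and Biopython.
    """
    # Imported lazily so the pure helpers above work without Biopython.
    from Bio import Entrez
    from Bio.Blast import NCBIWWW, NCBIXML

    Entrez.email = email  # NCBI asks for a contact address
    result = NCBIWWW.qblast(
        "blastp",
        "nr",
        ungap(aligned_query),
        entrez_query=organism_filter(species),
        hitlist_size=1,
    )
    record = NCBIXML.read(result)
    result.close()
    if not record.alignments:
        return None

    # The BLAST report only holds the aligned region, so fetch the full protein.
    accession = record.alignments[0].accession
    handle = Entrez.efetch(db="protein", id=accession, rettype="fasta", retmode="text")
    fasta_text = handle.read()
    handle.close()
    return fasta_text
```

Something like `fetch_top_hit(danio_asip_seq, "Stegastes partitus", "me@example.org")` would then stand in for the manual web search, where `danio_asip_seq` is the D. rerio sequence taken from the alignment.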

Where the problem starts is that I don't have one gene and one NCBI species but many of them (let's say p genes and n NCBI species). I already have an Ensembl protein multifasta file for each of my p genes.

My question is: is there an easy way to append to each of my p multifasta files the corresponding homologous protein sequence(s) for the n "NCBI species"?
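To make the p-by-n bookkeeping concrete, here is a minimal standard-library sketch of the loop. Everything in it is an assumption rather than an established tool: `find_homolog` is a hypothetical callable (for example a wrapper around a remote or local blastp search) that returns a `(header, sequence)` pair or `None`, and `query_tag` is whatever substring identifies the reference species in the Ensembl FASTA headers. The appended sequences are unaligned, so each output file would need to be re-aligned before building a tree.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(path: Path) -> List[Record]:
    """Parse a (multi)FASTA file into a list of (header, sequence) pairs."""
    records: List[Record] = []
    header, chunks = None, []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def append_homologs(
    fasta_paths: Iterable[Path],
    species_list: List[str],
    query_tag: str,
    find_homolog: Callable[[str, str], Optional[Record]],
    out_dir: Path,
) -> Dict[str, List[str]]:
    """For each gene file, look up one homolog per species and write an extended copy.

    Returns {file name: [species with no hit]} so gaps can be reviewed by hand.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    missing: Dict[str, List[str]] = {}
    for path in map(Path, fasta_paths):
        records = read_fasta(path)
        # Use the reference species' sequence from this alignment as the query.
        query = next((seq for head, seq in records if query_tag in head), None)
        if query is None:
            missing[path.name] = list(species_list)
            continue
        query = query.replace("-", "")  # drop alignment gaps before searching
        added: List[Record] = []
        for species in species_list:
            hit = find_homolog(query, species)
            if hit is None:
                missing.setdefault(path.name, []).append(species)
            else:
                added.append(hit)
        # Original files are left untouched; extended copies go to out_dir.
        with open(out_dir / path.name, "w") as out:
            for head, seq in records + added:
                out.write(f">{head}\n{seq}\n")
    return missing
```

Writing to a separate directory keeps the original Ensembl alignments intact, and the returned dictionary lists the gene/species pairs that still need manual attention.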

Thanks for any insight!

