Software development projects for Molecular Biology


  1. GCK2.5 debug under Wine. There are still some bugs that sometimes make using GCK2.5 under Wine a pain, especially the inability to annotate regions, to search for a sequence, and to open a new file.
  2. GCK2.5 export. GCK2.5 cannot export in EMBL format with the regions converted into features. It can, however, export the comments to a text file and the plain sequence to a text file. It should be trivial to write a Perl script that takes these two files and converts them into a single EMBL file, an EMBOSS cirdna/lindna input file, or a pDRAW32 file.
  3. GCK2.5/Wine desktop integration. When clicking on files that are associated with Windows programs (using Wine), the Linux file manager (e.g. Konqueror) passes the file as an argument to the associated Windows application, and the file is opened under Wine. However, GCK2.5 refuses to accept the file as an argument. When I click on a .gcc file, GCK2.5 starts up but opens an empty window, and I have to open the .gcc file from within GCK2.5. That is unnecessary clicking, especially when I need to navigate through several folder hierarchies. When GCK2.5 is running natively under Windows, is it possible to start it with a construct file as a command line argument? I should check that out.
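
The converter from item 2 could look roughly like the sketch below. It is written in Python rather than Perl, and it makes several assumptions that are not confirmed by anything above: the layout of the GCK2.5 comment export is unknown to this sketch, so `parse_regions` expects a hypothetical tab-separated `start, end, label` line per region, and the ID line it writes is simplified (the `UNC` division and `SV 1` are placeholders). All function names are illustrative.

```python
from collections import Counter


def parse_regions(text):
    """Parse region lines in a HYPOTHETICAL 'start<TAB>end<TAB>label' layout.

    The real GCK2.5 comment export may look different; adapt this function.
    """
    regions = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        start, end, label = line.split("\t", 2)
        regions.append((int(start), int(end), label))
    return regions


def to_embl(name, sequence, regions, topology="linear"):
    """Build a minimal EMBL-style record with each region as a misc_feature."""
    seq = "".join(sequence.split()).lower()  # drop whitespace and line breaks
    counts = Counter(seq)
    other = len(seq) - sum(counts[base] for base in "acgt")
    lines = [
        # Simplified ID line; division and version are placeholder values.
        f"ID   {name}; SV 1; {topology}; DNA; STD; UNC; {len(seq)} BP.",
        "XX",
        f"FH   {'Key':<16}Location/Qualifiers",
        "FH",
    ]
    for start, end, label in regions:
        # Feature key starts in column 6, location and qualifiers in column 22.
        lines.append(f"FT   {'misc_feature':<16}{start}..{end}")
        escaped = label.replace('"', '""')  # quotes are doubled inside qualifiers
        lines.append("FT" + " " * 19 + f'/note="{escaped}"')
    lines.append("XX")
    lines.append(
        f"SQ   Sequence {len(seq)} BP; {counts['a']} A; {counts['c']} C; "
        f"{counts['g']} G; {counts['t']} T; {other} other;"
    )
    for i in range(0, len(seq), 60):
        chunk = seq[i:i + 60]
        blocks = " ".join(chunk[j:j + 10] for j in range(0, len(chunk), 10))
        # Six blocks of ten bases, running position right-aligned to column 80.
        lines.append(f"     {blocks:<65} {i + len(chunk):>9}")
    lines.append("//")
    return "\n".join(lines) + "\n"
```

Calling `to_embl("pTest", sequence_text, parse_regions(comment_text))` returns the record as a string. Whether EMBOSS accepts the simplified ID line as written would still need to be checked.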


  1. pDRAW32 debug for wine
  2. Write a file format plug-in for EMBOSS. EMBOSS should be able to use pDRAW32 files (which are plain text files) as input and output files.
  3. Write a perl-script converter for embl/GCK2.5/pDRAW32


  1. Make a SuSE Linux 9 RPM for Staden. As a starting point for building RPMs one can take either the source or the binary files. Mostly I would like to integrate Staden better with the KDE file manager (so I need to create MIME types, file associations, icons and lots of .desktop files).
  2. Essentially the same as in 1, but for Mac OS X. Anders Nister has made a package that sometimes installs correctly, but it has no desktop integration. Some clickable icons and file associations would be nice. Unfortunately, you cannot click on shell scripts or drop files onto them. I experimented with Platypus to wrap the shell scripts I was writing, but drop support did not work with Platypus. It is also unclear to me how to make Mac OS X recognize that more than just .ab1 files can be opened with, e.g., pregap4. It is easy to make Mac OS X associate a new file extension (I tried .ab1) with an application (I used a Cocoa-wrapped shell script) by editing its Info.plist file (ctrl-click -> show package contents, etc.; it is an XML file that opens in a quite friendly XML editor where you can edit this type of stuff, and you can take another Info.plist file as a template to see what options you have). I associated my Cocoa wrapper with all files ending in .ab1 and gave them a specific icon and name. This works. But when I added another extension (I tried .txt, because plain sequence files should be readable by pregap4), it no longer worked. I expected Mac OS X to show several options in the "Open with" menu (such as TextEdit, pregap, etc.), but instead there was nothing... Apparently Mac OS X differs considerably from, e.g., KDE in the way file types and applications are associated.
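
For the Info.plist editing described in item 2, the file associations live in the `CFBundleDocumentTypes` array, where each file type gets its own dictionary. The sketch below builds such entries with Python's standard `plistlib` instead of the graphical editor. The bundle name, bundle identifier, icon file name and type names are made-up placeholders, and whether declaring `.txt` this way cures the empty "Open with" menu is untested.

```python
import plistlib


def document_type(name, extensions, icon_file=None, role="Viewer"):
    """Return one CFBundleDocumentTypes entry for a group of file extensions."""
    entry = {
        "CFBundleTypeName": name,
        "CFBundleTypeExtensions": list(extensions),  # extensions without the dot
        "CFBundleTypeRole": role,  # Editor, Viewer, Shell or None
    }
    if icon_file:
        entry["CFBundleTypeIconFile"] = icon_file
    return entry


def add_document_types(info, doc_types):
    """Return a copy of an Info.plist dictionary with extra document types."""
    updated = dict(info)
    existing = list(info.get("CFBundleDocumentTypes", []))
    updated["CFBundleDocumentTypes"] = existing + list(doc_types)
    return updated


# Placeholder bundle metadata; a real wrapper would load its own Info.plist
# with plistlib.load() and write the result back with plistlib.dump().
info = {"CFBundleName": "pregap4", "CFBundleIdentifier": "org.example.pregap4"}
info = add_document_types(info, [
    document_type("Trace file", ["ab1"], icon_file="trace.icns"),
    document_type("Plain sequence", ["txt"]),
])
xml_bytes = plistlib.dumps(info)  # XML property list as bytes
```

Keeping one dictionary per file type, rather than putting every extension into a single entry, lets each type carry its own name and icon.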


  1. Improvement of EMBOSS sequence manipulation tools (maintain annotation)
  2. Write a perl-script converter for embl/GCK2.5/pDRAW32
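
To make concrete what "maintain annotation" in item 1 would involve, here is a small Python sketch of the coordinate bookkeeping for two common manipulations, reverse-complementing and cutting out a subsequence. It is not EMBOSS code; the tuple layout `(start, end, strand, label)` with 1-based inclusive coordinates and the function names are assumptions chosen for illustration.

```python
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Reverse-complement a DNA string (other characters pass through unchanged)."""
    return seq.translate(COMPLEMENT)[::-1]


def revcomp_with_features(seq, features):
    """Reverse-complement seq and remap each feature onto the new strand.

    A feature at start..end on a sequence of length n ends up at
    (n - end + 1)..(n - start + 1) with its strand flipped.
    """
    n = len(seq)
    remapped = [
        (n - end + 1, n - start + 1, -strand, label)
        for start, end, strand, label in features
    ]
    return reverse_complement(seq), sorted(remapped)


def extract_with_features(seq, features, start, end):
    """Cut out seq[start..end] and keep the features that lie fully inside it."""
    sub = seq[start - 1:end]
    kept = [
        (fs - start + 1, fe - start + 1, strand, label)
        for fs, fe, strand, label in features
        if fs >= start and fe <= end
    ]
    return sub, kept
```

Features that only partly overlap the extracted range are dropped here for simplicity; a real tool would more likely truncate them and mark the location as partial.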


  1. Bibliography management system. The main reason why researchers in the life sciences use Microsoft Office is the availability of relatively good bibliography management systems. Most people I know use EndNote (but there are other very good tools around). Although there are several open source bibliography management systems, none of them comes close to the commercial ones, especially where integration with OpenOffice is concerned. There are people working on a bibliography management system for OpenOffice, but apparently only a few, so it will take time... Of course, if you use LaTeX, you have a very good tool, but it is futile to try to convince 300,000 life science researchers to learn LaTeX. There is also a reference management system for Linux called SixPack, but I have not tried it and I think it is no longer under development.