Opening old plasmid map files

GCK installation disk

For a new cloning project, we needed to access the plasmid maps of an old construct of mine (pSecTagN2, which was a precursor of pMosaic, aka pSecTagI) which I had composed from many different sources in 1999. In 1999, I was working in Kari Alitalo's laboratory as a Ph.D. student. We were at the "cutting edge" of technology because we used a software program called Gene Construction Kit (GCK) to keep track of our clonings. Last Wednesday, I spent 4 hours of my working time opening one file created with GCK version 2.5 in 1999.

Gene Construction Kit
GCK had the peculiarity that it did not store the history of a construction. However, it was possible to create so-called Illustration files (.gci file extension) by hand, which could be used to document complex cloning projects and to create figures for manuscripts or presentations. I had indeed created such a .gci file for the construction of pSecTagI, and it contains a detailed description of the cloning. And since I never delete anything, the file was still in my "Documents/Sequences/GCK/pSecTag/pSecTagI" folder. But how to open it? The actual GCK sequence files can be opened by the contemporary DNA construct mapping and cloning software SnapGene. We had used GCK until 2013, when we replaced it with the much better designed and more capable SnapGene. However, SnapGene cannot open the .gci (Illustration) files.

Updating macOS and getting rid of constantly nagging pop-ups
To open this file, I needed to install GCK on one of my computers. I had to use my old 2013 MacBook because GCK runs only on Mac and Windows, and all my PCs are running Linux. The 2013 MacBook had been upgraded to macOS 10.15 (Catalina) perhaps a year ago, when I last used it. After booting, I first needed to install an onslaught of updates, without which I was permanently bombarded with pop-ups that asked me to bin "Pulse Secure" and "navlibx". After about 2 hours, I had downloaded and installed all updates and was even able to manually delete "Pulse Secure". (Pulse Secure was the horrendously insecure "secure communication" tool that the university IT wanted the whole university to use a few years back. IT soon realized that the license fees would be too expensive to allow everybody to use a VPN, and therefore our university was running two parallel VPN solutions: Pulse Secure and OpenVPN, the latter being free. Then it turned out that Pulse Secure was just pretending to be secure. Hopefully, IT has learned the lesson that you cannot buy security with money. But of course, at the moment our IT is again hopelessly behind, as everybody else is moving to WireGuard, which is conceptually easier and better performing.)

GCK 4.5 does not run under macOS 10.15 Catalina and never will
After downloading and installing GCK (which had meanwhile been updated to version 4.5), the app icon appeared greyed out and crossed out: this app is too old to run on macOS 10.15 (which is itself already 2.5 years old). Unlike Windows, which goes a long way to maintain backward compatibility, macOS constantly seems to make changes that break older software. I went to the GCK homepage but could not find anything in the FAQ section. I needed Google to help me find this blog post from the GCK developers: https://www.textco.com/blog/2020/09/gck-gi-not-supported-under-mac-os-10...

I was not very surprised to realize that GCK is now almost abandonware. GCK has not received any feature updates for perhaps 10 years. However, it now smells like fraud to continue selling software that does not even run on a fully patched operating system. I don't know whether the Windows version of GCK runs under Windows 11, but I would assume that it does. But my only Windows 11 machine is in use at the university because Windows 11 is the only OS that was able to operate our Bio-Rad FPLC after we had upgraded its firmware (I assume that the problem is not Windows 10 itself but rather our university's add-ons, likely its firewall configuration or antivirus software).

Virtualized macOS <10.15 needed: Parallels
So I needed to run an earlier version of macOS in a virtual machine. VirtualBox would be my go-to choice. Running macOS as a guest OS is possible with VirtualBox, but IRL very difficult to pull off (I tried it once without success). The recommended solution is Parallels. Unfortunately, Parallels discontinued its free Parallels Lite version about 1.5 years ago. I do even have an old license for the full version, but the installation medium and the licensing information are in the lab, and I am working from home. I downloaded the trial version, guessing that one month would be ample time to convert the GCK Illustration files into .... good question! Into what? I guess the only possibility would be to print these illustrations to PDF files.

Parallels is not enough, we need an old installation image
After installation, Parallels immediately found the 10.13 installer, which was still on my hard drive from the update to High Sierra about 3 years back. However, the installation failed (although the upgrade to High Sierra had succeeded with the same image). I then tried to read a physical installation disk with Mac OS X 10.7 (Lion) from the SuperDrive, but Parallels failed even to see the disk. So I created an image of the disk, which took another hour: the installer DVD is 9 years old. But the installation from the image file succeeded.

Getting files in and out of the virtualized MacOS 10.7
Getting the file into the virtual machine was another near-impossibility. I first thought that Parallels must have built-in file sharing between the guest and the host machine. It indeed has, but I was unable to find the guest add-ons that are required for this to work. Then I thought about using the local network to transfer files. That seemed to work, but when the connection was supposed to be established, I merely received the error message "Connection failed". Then I thought about logging into my Gmail account in the guest and mailing the file to myself. Interestingly, none of the latest versions of Safari, Firefox or Chrome for Mac OS X 10.7 managed to log into my Gmail account. After that, I concluded that I would need to put the file somewhere on the internet and then simply download it with a browser. So I uploaded the file to my webserver and made it publicly available. However, the only browser that was able to connect was Chrome, and it refused to download the file because the system clock of my computer was supposedly set incorrectly (which must be a bogus error message, since the clock was correctly set using Apple's European time server). To enable Parallels' host-to-guest file sharing feature, I started by updating Mac OS X 10.7 (Lion), again wasting perhaps half an hour. After that, the installation medium was indeed mounted, and I could install the Parallels Tools, enable file sharing and finally copy the file to the virtual machine's Documents directory.
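One further route that was not tried above is to serve the file from the host over plain HTTP on the local network, which sidesteps HTTPS and its certificate checks altogether. The following is only a minimal sketch in Python, assuming Python 3.7 or later on the host and that the guest can reach the host's IP address (neither is confirmed by the account above). The function name serve_directory and port 8000 are arbitrary choices for illustration.

```python
import functools
import http.server
import threading


def serve_directory(directory, port=8000):
    """Serve `directory` read-only over plain HTTP in a background thread.

    Returns the server object; call .shutdown() on it to stop serving.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # An empty host string binds to all interfaces so that the guest can
    # reach the host. Passing port=0 lets the OS pick a free port.
    server = http.server.ThreadingHTTPServer(("", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


if __name__ == "__main__":
    srv = serve_directory(".", port=8000)
    print("Serving on port", srv.server_address[1], "(Ctrl+C to stop)")
    try:
        threading.Event().wait()  # block until interrupted
    except KeyboardInterrupt:
        srv.shutdown()
```

A browser in the guest pointed at the host's address and this port should then list the files. Because this exposes the folder to everything on the local network without authentication, it is best pointed at a folder that contains only the file in question and stopped right after the transfer.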

Printing to PDF
The file opened without problems. GCK itself can only export to JPG and PICT files. I last saw PICT files perhaps in 1999, and the JPG files are of atrocious quality. "Publication-quality illustrations" seems a bit far-fetched. So I decided to print to a PDF file instead. But since there was no printer installed, I could not print to PDF. To my understanding, Apple's Quartz is essentially PDF, so why can't I print to PDF then? I remember having done so in the old days, but now this option was completely missing, as I did not even get a print dialog when I clicked the "Print" button in the GCK UI. Printing to PDF worked normally from other 10.7 apps. Only GCK refused to give me a print dialog.

Dummy printer driver
I guessed I would need to install a printer driver to get a print dialog. But how to do so without having a printer? I simply set up IP printing pointing to a fake IP address on the local network. After a long time trying to contact the printer, OS X gave up and offered me the possibility to proceed without any answer from the printer. After that, I was able to print to PDF. However, the only way to get the A3 paper size (to fit the whole construction illustration on one page) is to set the default paper size to A3 and then use the Preview function. If you print straight to PDF, the driver will split large illustrations across several A4 pages. Only saving the print preview results in a PDF file with the A3 page size, which keeps larger illustrations on one page.

Take-home message
Ten years are enough to make the recovery of some digital documents nearly impossible. The more niche the software, the bigger the problem. If it uses an undocumented, proprietary file format, the situation can become unsolvable. I have a couple of vector maps in MacVector format (version 6, I believe). These were created by Marika Kärkkäinen perhaps around the same time as my GCK file. However, I do not own a license for a MacVector version that is able to run on Mac OS X or macOS, and hence I lost my ability to open these files perhaps 15 years ago, when the Mac OS 9 compatibility layer was removed from Mac OS X. I have kept my old PowerBook G3 Kanga (which I bought for 27000 Finnish marks in early 1998) in my garage, but I have no clue whether it would still boot at all. While checking whether my information is up to date, I just realized that there is now a free version of MacVector available with a reduced feature set. So at least I can open Marika's old plasmid maps (provided I have a functional macOS computer).
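One way to act on this before the next emergency is to take stock of which files depend on niche software. The following is a minimal sketch in Python, not something I have used in the lab: the watch list contains only the .gci extension mentioned above, and the function name find_legacy_files and the scanned folder are hypothetical choices for illustration.

```python
from datetime import datetime
from pathlib import Path

# Hypothetical watch list: ".gci" comes from the text above.
# Extend it with whatever niche formats your own archive contains.
LEGACY_EXTENSIONS = {".gci"}


def find_legacy_files(root, extensions=LEGACY_EXTENSIONS):
    """Return (path, last-modified) pairs for files with a watched extension."""
    hits = []
    for path in Path(root).rglob("*"):
        # Compare lower-cased suffixes so that ".GCI" is found as well.
        if path.is_file() and path.suffix.lower() in extensions:
            modified = datetime.fromtimestamp(path.stat().st_mtime)
            hits.append((path, modified))
    # Oldest first: these are the files most at risk of being unreadable.
    return sorted(hits, key=lambda item: item[1])


if __name__ == "__main__":
    for path, modified in find_legacy_files(Path.home() / "Documents"):
        print(f"{modified:%Y-%m-%d}  {path}")
```

Such a list only tells you what to convert (for example to PDF) while the original software still runs somewhere. It does not solve the format problem itself.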

What about SnapGene?
At the moment, we are using SnapGene to plan and document our clonings. Like GCK, it is proprietary software and could vanish any day without us having any recourse. While SnapGene is vastly superior to GCK, commercial companies don't create software because they want to make our lives easier. They create software to earn money, and as soon as they stop earning money, they will stop supporting the software. SnapGene has become the de facto industry standard, and it appears unlikely that it would go away any time soon. But the same was true for Vector NTI 10 years ago, and Thermo Fisher killed it anyway last year. Therefore, it would be important to lobby for opening up the proprietary SnapGene file format. Or at least to start reverse-engineering it.
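A generic first step in looking at any undocumented file format is to inspect the first bytes of a few files side by side. The sketch below, in Python, prints such a header as a hex dump. It assumes nothing about how SnapGene files are actually laid out, and the function names hex_preview and read_header are made up for illustration.

```python
def hex_preview(data, width=16):
    """Format bytes as offset, hex and printable-ASCII columns."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Non-printable bytes are shown as dots in the text column.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part.ljust(width * 3)} {text_part}")
    return "\n".join(lines)


def read_header(path, length=64):
    """Read the first `length` bytes of a file in binary mode."""
    with open(path, "rb") as handle:
        return handle.read(length)


if __name__ == "__main__":
    import sys

    # Usage: python hexpreview.py file1 file2 ...
    for name in sys.argv[1:]:
        print(name)
        print(hex_preview(read_header(name)))
```

Comparing the output for several files often shows which leading bytes are constant and which vary from file to file, which is usually where such an effort begins.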