The Publication Harvester is a software tool that downloads publications from PubMed, stores them in a database, and generates an accurate count of publications for a set of people. For each person, the harvester uses a set of possible name variations for that individual, and records the list of authors. The goal of the software is to gather large amounts of data about specific people from PubMed for statistical analysis. It records the people, publications and publication data in a database, and generates reports based on that data.
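To illustrate how a set of name variations can be turned into a single PubMed search, here is a minimal sketch in Python. It is not the Publication Harvester's own code (which is written in C#), and this page does not describe the queries the tool sends. The sketch assumes the public NCBI E-utilities ESearch endpoint and the PubMed `[au]` author field tag. The function names and the sample name "Smith JA" are hypothetical. The code only builds the query and URL; it makes no network request.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Public NCBI E-utilities search endpoint (an assumption of this sketch,
# not something taken from the Publication Harvester itself).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_author_query(name_variations):
    """Join name variations into one PubMed author query using OR.

    Each variation is quoted and tagged with [au] so PubMed treats it
    as an author name. Blank entries are skipped.
    """
    terms = [f'"{name.strip()}"[au]' for name in name_variations if name.strip()]
    if not terms:
        raise ValueError("at least one name variation is required")
    return "(" + " OR ".join(terms) + ")"


def build_esearch_url(name_variations, retmax=100):
    """Build an ESearch URL that would search PubMed for one person.

    retmax caps how many PubMed IDs a single response would return.
    """
    params = {
        "db": "pubmed",
        "term": build_author_query(name_variations),
        "retmax": retmax,
    }
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical person with two name variations.
    print(build_author_query(["Smith JA", "Smith J"]))
    print(build_esearch_url(["Smith JA", "Smith J"]))
```

Combining the variations with OR means a publication that matches more than one variation appears only once in the search results, which matters when the goal is a count of publications per person.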
The Publication Harvester software runs on Windows Vista and XP. It was written in C#, and requires .NET Framework 3.0.
The user manual describes installation and use of the Publication Harvester software:
The software requirements specification that was used to develop and maintain the software can be found here.
For more information, see PublicationHarvester: An Open-Source Software Tool for Science Policy Research (Research Policy 35 (2006) 970–974).
Software downloads:
Quick start:
More detailed installation instructions can be found in the user manual (see below).
The following sample files may be helpful:
Did the sample input file work for you, but when you put together your own People file you didn't get the results you were expecting? Take a look at this guide to troubleshooting your People file.
A few people have reported trouble running the Publication Harvester software. They found that they were getting errors that look like this:
[Microsoft][ODBC Text Driver] Too few parameters. Expected 2.
Could not find installable ISAM
This application has failed to start because msaccess.exe was not found
The instructions in this guide to troubleshooting problems reading input files helped them resolve the problem.

This software is released under the GNU General Public License (GPL).
The Publication Harvester project is maintained by Andrew Stellman of Stellman & Greene Consulting. If you have questions, comments, patches, or bug reports, please contact pubharvester@stellman-greene.com.