How to build, test and run
This page explains how to install the tools you need to run OpenRefine from source and to develop it. You will need:
- a Unix/Linux shell environment or the Windows command line, which should already be installed on your machine;
- OpenRefine's source code;
- a Java Development Kit (JDK), version 11 or later;
- Apache Maven;
- Node.js and npm (Node.js version 16 or later).
Get OpenRefine's source code
With Git installed, use the git clone command to download the project's repository (https://github.com/OpenRefine/OpenRefine.git) to a directory of your choice.
Set up JDK
You must install a JDK and set the JAVA_HOME environment variable. Make sure it points to the JDK, and not the JRE.
OpenRefine is known to work with Java 11 to 21.
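The JDK-versus-JRE distinction can be checked from a Unix-like shell. The following is a minimal sketch, not part of OpenRefine's tooling: the helper name is_jdk_home is made up for this example, and it relies on the fact that a JDK ships the javac compiler in its bin directory while a JRE does not.

```shell
# Return success if the given directory looks like a JDK home,
# i.e. it contains both the java launcher and the javac compiler.
# (is_jdk_home is a hypothetical helper name used for illustration.)
is_jdk_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ] && [ -x "$1/bin/javac" ]
}

# Example use: report whether the current JAVA_HOME points to a JDK.
if is_jdk_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME points to a JDK: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or does not point to a JDK (no bin/javac found)"
fi
```

On Windows the executables are named java.exe and javac.exe, so this check only applies to Unix-like shells.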
On Windows:
- On Windows 10, click the Start Menu button, type env, and look at the search results. Open the result for editing environment variables. (If you are using an earlier version of Windows, use the “Search” or “Search programs and files” box in the Start Menu.)
- Click the Environment Variables button at the bottom of the Advanced window.
- In the Environment Variables window that appears, create a new variable with the key JAVA_HOME. You can set the variable for only your user account or set it as a system variable; it will work either way.
- Set the Value to the folder where you installed the JDK, in the format D:\Programs\OpenJDK. You can also browse to this folder from the same dialog.
On Mac, first install Java. You can do so either with Homebrew, using brew install java, or by downloading it from Adoptium and installing it manually.
You then need to make sure the JAVA_HOME environment variable is properly set, so that OpenRefine can find your Java installation.
Check the JAVA_HOME environment variable with:
$JAVA_HOME/bin/java --version
If this shows your Java version, your JAVA_HOME variable is set up correctly. If it shows an error, you need to adjust it.
To do so, you can use:
export JAVA_HOME="$(/usr/libexec/java_home)"
Or, to select a specific installed version, for example Java 13.x:
export JAVA_HOME="$(/usr/libexec/java_home -v 13)"
On Linux, install a JDK with your distribution's package manager. On Debian/Ubuntu derivatives, enter the following:
sudo apt install default-jdk
On Fedora/CentOS, use:
sudo dnf install java-devel
On Arch Linux, use:
sudo pacman -S jdk-openjdk
For other distributions, search for any JDK in your package repository: most should be compatible with OpenRefine.
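Installing a JDK from a package repository does not necessarily set JAVA_HOME. As a minimal sketch, assuming GNU readlink (for its -f option) and a javac that is reachable on the PATH, the JDK home can be derived by resolving the javac symlink and stripping the trailing bin/javac. The function name jdk_home_from_javac is made up for this example.

```shell
# Print the JDK home for a given javac path by resolving symlinks
# and stripping the trailing /bin/javac.
# (jdk_home_from_javac is a hypothetical helper; readlink -f assumes GNU coreutils.)
jdk_home_from_javac() {
    resolved="$(readlink -f "$1")" || return 1
    dirname "$(dirname "$resolved")"
}

# Example use: set JAVA_HOME from whichever javac is first on the PATH.
if command -v javac >/dev/null 2>&1; then
    JAVA_HOME="$(jdk_home_from_javac "$(command -v javac)")"
    export JAVA_HOME
    echo "JAVA_HOME=$JAVA_HOME"
fi
```

To keep the setting across sessions, the export line would typically go into your shell's startup file.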
Maven
OpenRefine development requires Apache Maven for its build, test, and packaging processes. We encourage using the latest version of Apache Maven when developing OpenRefine; with older versions, your IDE may report spurious errors about the POM, dependencies, or packages.
On Windows, install Maven. Then ensure that the M2_HOME or MAVEN_HOME environment variable is set, or that mvn is on your system PATH:
MAVEN_HOME=E:\Downloads\apache-maven-3.8.4-bin\apache-maven-3.8.4\
On Mac, install Maven via Homebrew with brew install maven.
Otherwise, install Maven manually. Then ensure that the M2_HOME or MAVEN_HOME environment variable is set, or that mvn is on your system PATH:
MAVEN_HOME=/opt/apache-maven-3.8.7
On Linux, install Maven with the package manager of your distribution. For instance:
- On Debian/Ubuntu derivatives, use sudo apt install maven
- On Fedora/CentOS, use sudo dnf install maven
- On Arch Linux, use sudo pacman -S maven
Other distributions are likely to offer Maven in their official package repository as well.
Node.js and npm
The OpenRefine webapp requires Node.js and npm to install package dependencies. We require Node.js 16 or newer. Download and install Node.js (on Windows, you can alternatively install nvm to easily manage multiple Node.js versions on your system). You should then have node and npm installed. You can check the versions by typing:
node -v
npm -v
You can update npm to the latest version by typing:
npm install -g npm@latest
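Once everything is installed, a quick check from a Unix-like shell can confirm that the prerequisites are reachable. This is a sketch only, not part of OpenRefine's tooling: the helper name node_major is made up for this example, and the check covers only the Node.js minimum version stated above.

```shell
# Extract the major version from `node -v` style output, e.g. "v16.20.2" -> 16.
# (node_major is a hypothetical helper name used for illustration.)
node_major() {
    version="${1#v}"        # drop the leading "v"
    echo "${version%%.*}"   # keep everything before the first dot
}

# Report any required tool that is missing from the PATH.
for tool in java mvn node npm; do
    command -v "$tool" >/dev/null 2>&1 || echo "Missing from PATH: $tool"
done

# Warn if the installed Node.js is older than version 16.
if command -v node >/dev/null 2>&1; then
    major="$(node_major "$(node -v)")"
    [ "$major" -ge 16 ] || echo "Node.js $major is too old; version 16 or newer is required"
fi
```

The script prints nothing when all four tools are found and Node.js is recent enough.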
Building
To see which commands OpenRefine's build system supports, type:
./refine -h
To build the OpenRefine application from source, type:
./refine clean
./refine build
Note that the refine script is a wrapper over the Maven build system. You can often use Maven commands directly, but running some goals in isolation might fail (if that is the case, try adding the compile and test-compile goals to your invocation).
Testing
Since OpenRefine is composed of two parts, a server and an in-browser UI, the testing system reflects that:
- on the server side, it's powered by TestNG and the unit tests are written in Java;
- on the client side, we use Cypress and the tests are written in JavaScript.
To run server tests, use:
./refine test
To run the Cypress tests for the first time, you must go through the installation process.
Then, you need to run two processes in parallel:
- OpenRefine itself, ideally running off a fresh workspace directory:
./refine -d /tmp/openrefine_workspace
- Cypress, with the command:
yarn --cwd ./main/tests/cypress run cypress open
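The two processes can also be started from a single shell session. The following is a rough sketch under stated assumptions: OpenRefine is assumed to listen on its default address, 127.0.0.1:3333, curl is assumed to be installed, and the function names wait_for_url and run_cypress_session are made up for this example.

```shell
# Poll a URL until it responds, giving up after the given number of attempts.
# (wait_for_url is a hypothetical helper name used for illustration.)
wait_for_url() {
    url="$1"
    attempts="$2"
    while [ "$attempts" -gt 0 ]; do
        curl --silent --fail --output /dev/null "$url" && return 0
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# Start OpenRefine in the background, open Cypress once the server answers,
# then signal the background job when Cypress exits.
run_cypress_session() {
    ./refine -d /tmp/openrefine_workspace &
    refine_pid=$!
    # Assumption: OpenRefine listens on its default address, 127.0.0.1:3333.
    if wait_for_url "http://127.0.0.1:3333/" 120; then
        yarn --cwd ./main/tests/cypress run cypress open
    else
        echo "OpenRefine did not start in time" >&2
    fi
    # Depending on how the wrapper script launches Java, the server process
    # may need to be stopped separately.
    kill "$refine_pid"
}
```

Call run_cypress_session from the top-level directory of your clone. Running the two commands in separate terminals, as described above, remains the simpler option when you want to watch the server log.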
We recommend running only individual test suites locally and relying on our continuous integration infrastructure to run the entire test suite, as running it in full is rather time-consuming.
Running
From the top-level directory of the OpenRefine application, you can build, test and run OpenRefine using the ./refine shell script (if you are working in a *nix shell) or the refine.bat script from the Windows command line. Note that refine.bat on Windows only supports a subset of the functionality supported by the refine shell script. The example commands below use the ./refine shell script; use refine.bat instead if you are working from the Windows command line.
To run OpenRefine from the command line (assuming you have been able to build from the source code successfully), type:
./refine
By default, OpenRefine will use refine.ini for configuration. You can copy it and rename the copy to refine-dev.ini, which will then be used for configuration instead. refine-dev.ini won't be tracked by Git, so feel free to put your custom configurations into it.
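Creating the development configuration amounts to a single copy. The sketch below wraps it in a small function so that an existing refine-dev.ini is never overwritten; the function name make_dev_ini is made up for this example.

```shell
# Copy refine.ini to refine-dev.ini in the given directory (default: current
# directory), leaving an existing refine-dev.ini untouched.
# (make_dev_ini is a hypothetical helper name used for illustration.)
make_dev_ini() {
    dir="${1:-.}"
    if [ -f "$dir/refine-dev.ini" ]; then
        echo "refine-dev.ini already exists, leaving it as is"
    else
        cp "$dir/refine.ini" "$dir/refine-dev.ini"
    fi
}
```

Run make_dev_ini from the top-level directory of your clone, then edit refine-dev.ini as needed.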
If you wish to run the application manually, without using the refine script, you can do so via Maven with mvn exec:java. The entry point of the application is the com.google.refine.Refine class.
Building distributions (packaged versions)
The Refine build system uses Apache Maven to automate the creation of the installation packages for the different operating systems. The packaging process is currently optimized to run on Mac OS X, which is the only platform capable of creating the packages for all three operating systems that we support.
To build the distributions, type:
./refine dist <version>
where <version> is the release version.
Developing with Eclipse
OpenRefine's source comes with Maven configuration files, which are recognized by Eclipse if the Eclipse Maven plugin (m2e) is installed.
At the command line, go to a directory not under your Eclipse workspace directory and check out the source:
git clone https://github.com/OpenRefine/OpenRefine.git
In Eclipse, invoke the Import... command and select Existing Maven Projects.
Choose the root directory of your clone of the repository. You get to choose which modules of the project will be imported. You can safely leave out the packaging module, which is only used to generate the Linux, Windows and macOS distributions.

To run and debug OpenRefine from Eclipse, you will need to add a run configuration on the server subproject.
Right-click on the server subproject, click Run as... and then Run configurations..., and create a new Maven Build run configuration. Rename the run configuration to OpenRefine. Enter the root directory of the project as the Base directory and use exec:java as the Maven goal.
This will add a run configuration that you can then use to run OpenRefine from Eclipse.