This post is part of a series of guides on how to write your first ACM SIGGRAPH / TOG paper. You can find the other articles here.
Your SIGGRAPH project has resulted in something new and exciting. Congratulations! Releasing great open-source software is a crucial last step in sharing it with the world.
This post is a tutorial on how to make your code release a success and amplify the impact of your work.
Why release code?
Releasing good code can dramatically increase the reach of your project, benefiting you and many others for years to come. More people will know about your project, use your method, and cite your paper, including distant users that you might never expect! In principle, a clearly written technical paper provides all the details needed to implement a method, but in practice the burden of reimplementation is often an insurmountable hurdle. By providing ready-to-use code, you give others an easy path to access your work.
Additionally, software is a form of communication. It precisely describes all of the little details of your research in unambiguous language, a valuable complement to the mathematical descriptions in a paper and the verbal descriptions in a talk. In my experience, some readers will find it much easier to understand a new method by reading well-documented code than by reading a paper!
Lastly, code releases greatly improve reproducibility & replicability in computer graphics, that is, the ability to reproduce the published results of a method. When canonical code is released, it serves as a public reference point for a particular method, enabling the community to move forward and collectively do better research.
The basics
First and foremost, releasing code does not mean simply uploading the project code folder from your machine! The primary act of preparing a code release is converting your “experimental” code into software which is ready to be used by others. Other important aspects include documentation, hosting, and licensing. There’s a lot to think about, and it’s okay if you can’t do everything perfectly right, but every hour you invest will pay dividends in the long run.
Before going further, I should clarify that this tutorial will focus mainly on releasing “artifacts”, one-off codebases which are typically associated with a single paper. Developing large, general libraries or maintaining complex software systems is a very different challenge, which has more in common with mainstream software engineering practices.
Timeline
When working on a research project, you rarely write clean, well-documented code from the start, especially near a deadline. I generally recommend preparing a code release after your paper has been accepted, but before it is presented at the venue, around the same time you are preparing a talk.
In some cases, you might be able to prepare a basic version of your code earlier, for submission as supplemental material. If possible, this is great! Some reviewers are very appreciative when supplemental code is included with a submission. If your code is not ready for review, but is ready before the “camera ready” deadline for the paper, you can generally still include it with the final submission for the official archive (though policies may change; be sure to confirm these details).
As time goes on, think of your code release as a tapering process—you do a lot of work at the beginning, a little work after the initial release to add features and fix some bugs, and then later you just fix the occasional issue or update broken dependencies. Unlike managing a large living software project, paper artifacts rarely require significant time commitments over extended periods of time.
Hosting
Where should you actually put your open source code? These days, public version control websites like GitHub and Bitbucket are excellent options. They are reliable, searchable hosts, and seem likely to persist for many years to come (although they are not without some criticisms). Create a repository for your project, either under your personal account or some organizational account, and start committing/uploading code. For this purpose, it is even okay to do a single large “upload”, committing all of the files at once. If your repository is ever publicly visible in an unfinished state, add prominent text indicating as much.
Hosted version control like GitHub can be used in addition to releasing a zip file archive of your code. Ideally, an initial version of your code can be submitted with your paper to the ACM Digital Library, while a GitHub page offers a discoverable and updatable home for your code, incorporating new features and fixes.
That being said, if you are short on time, releasing on GitHub is probably the more impactful of the two.
Licenses and ownership
Software licenses dictate what others can do with your code, particularly in terms of (a) modifying & redistributing it, and (b) making money from it. This is a complicated subject—I am not a lawyer, and this is not legal advice!
In short, I strongly recommend an “MIT License” for research code. It is a very permissive license, which allows others to modify your code however they want, and even to use it in for-profit projects. At first, this sounds bad—what if someone steals your code? And don’t you deserve to earn some profit? But in reality, both of these outcomes are very rare, and on the positive side a permissive MIT License will further increase the impact of your code; any more restrictive license would create a significant barrier for those in the industry to use your software.
There are some exceptions. Your employer, your funding source, or those of your coauthors may place restrictions on how you can license the resulting code. Or perhaps you have an actionable plan to commercialize or otherwise profit from your research. If so, there are many other licenses available, and a code release is still very beneficial! Discuss these concerns early in your project to ensure there are no misunderstandings among the authors, and consider seeking real legal advice if you wish to retain exclusive commercial rights to your code.
7 tips for a great code release
The specific steps you follow to release code will vary greatly depending on the project and your own work habits. Rather than trying to list all possible steps, the rest of this tutorial will be a loose collection of concrete tips and best practices to help you be successful.
1: Make time to do it right
The single most important ingredient for a successful code release is you, the author, committing to invest time to make it happen. You might spend months preparing and fine-tuning the text of a SIGGRAPH paper to share your research; shouldn’t you at least spend a week releasing code, which is another important channel for sharing research? It may be difficult, because releasing code comes late in the process when a project is seemingly “done”, but investing that last week to release quality code will be worth it in the long run.
2: Libraries vs applications
For some projects, code releases are best structured as a library: a collection of functions or data structures that are integrated into other software. Other projects are better suited as an application: desktop/web/mobile programs which are driven by a human to create an experience or process some input. Think carefully about which approach makes more sense for your work: how do you imagine others might want to use your code?
The best-of-both-worlds approach is to release code as a library and as an application. Implement your project as a library with general-purpose functions which can be called for key subroutines, and then build a separate standalone application on top of that library.
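As a rough illustration of this split, here is a minimal Python sketch. Everything in it is hypothetical: the `smooth` function is just a stand-in for whatever your method actually computes, and the file names in the comments are only one possible layout. The point is that the application layer does nothing except parse arguments and call the library.

```python
"""Hypothetical layout: a tiny library function plus a thin command-line wrapper."""
import argparse


# --- Library part (could live in e.g. mymethod/core.py; the name is made up) ---
def smooth(values, iterations=1):
    """Return a copy of `values` after `iterations` passes of neighbor averaging.

    This is a placeholder for the real method. Endpoints are held fixed.
    """
    result = [float(v) for v in values]
    for _ in range(iterations):
        previous = list(result)
        for i in range(1, len(previous) - 1):
            result[i] = (previous[i - 1] + previous[i] + previous[i + 1]) / 3.0
    return result


# --- Application part (could live in e.g. mymethod/cli.py) ---
def main(argv=None):
    """Parse command-line arguments, call the library, and print the result."""
    parser = argparse.ArgumentParser(description="Smooth a list of numbers.")
    parser.add_argument("values", nargs="+", type=float, help="input samples")
    parser.add_argument("--iterations", type=int, default=1,
                        help="number of smoothing passes")
    args = parser.parse_args(argv)
    smoothed = smooth(args.values, iterations=args.iterations)
    print(" ".join(f"{v:.3f}" for v in smoothed))
    return 0
```

Someone who wants to build on the method can import `smooth` directly, while someone who just wants to try it can call `main` from a small script guarded by `if __name__ == "__main__":`. Passing `argv` explicitly also makes the command-line path easy to exercise in tests.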
If you don’t have the time to fully reorganize your codebase, focus on preparing the minimal necessary functionality that will allow a user to try out your method, or reproduce a key result from your paper. Once they have seen it work, they will be more willing to dig through your code and adapt it for their purpose.
3: Keep it small
It is not necessary to include every single thing which appears in your paper in your code release. This is especially true in SIGGRAPH papers, which might include a wide variety of demonstrations and comparisons. Identify the core functionality of your work, and condense it down to a simple command line tool, an easy-to-use GUI, or even a single function that can be called.
Various advanced features can be added to your code release as needed, and it is okay if they are less well-documented or more difficult to use than the core functionality. Invest most of your energy preparing excellent code for the simplest, central idea of your project.
4: Code with users in mind
When we write code for a research project, we are motivated by “how can I perform this experiment?” When releasing code, instead ask “what will others want from this code?”
For instance, if someone else wants to pass their own data into the code, what file formats might they want to use? How might they want the output? What pitfalls are they likely to encounter that can be anticipated and handled intelligently?
Let these kinds of questions guide you in preparing your code for release. If you find it even a little difficult to use your own code, then outsiders will find it extremely difficult to use!
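To make this concrete, here is a small sketch of what anticipating pitfalls might look like for an input loader. The function name `load_points`, the choice of 2D points, and the two supported formats are all assumptions made for illustration; your project's inputs will differ. The idea is simply to accept more than one common format and to fail with a message that tells the user what to fix.

```python
import csv
import json
from pathlib import Path


def load_points(path):
    """Load a list of (x, y) points from a .csv or .json file.

    Raises an error with an actionable message for common mistakes,
    rather than failing deep inside the method.
    """
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"Input file not found: {path}")

    suffix = path.suffix.lower()
    if suffix == ".csv":
        with path.open(newline="") as f:
            rows = [row for row in csv.reader(f) if row]  # skip blank lines
    elif suffix == ".json":
        with path.open() as f:
            rows = json.load(f)  # expected shape: [[x, y], [x, y], ...]
    else:
        raise ValueError(
            f"Unsupported file type '{suffix}'. Expected .csv or .json."
        )

    points = []
    for index, row in enumerate(rows):
        if len(row) != 2:
            raise ValueError(
                f"Entry {index} in {path.name} has {len(row)} values; "
                "expected 2 (x, y)."
            )
        points.append((float(row[0]), float(row[1])))
    if not points:
        raise ValueError(f"{path.name} contains no points.")
    return points
```

A few extra lines of checking like this cost little to write, and they can save an outside user from a confusing crash on their very first attempt.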
5: Ready-to-run examples
Include at least one extremely easy-to-run sample in the repository. This provides a direct path for anyone to get started with the code!
If input data is needed, package at least one sample input in a subdirectory. For a command line application, provide copy/paste-able instructions to install and run the software. For a GUI application, include explicit step-by-step screenshot guides. The more direct, the better.
6: Documentation and images
Documentation is accompanying text which explains the usage and function of code. Documentation can take the form of inline comments in code, external documentation pages, or simply a text file—any of these is perfectly fine! What matters is that you include clear and precise text explaining what your code does. It is most important to document the public-facing portions of your code that a user actually interacts with, but documenting internals is also a valuable exercise in technical communication.
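As one possible shape for documenting a public-facing function, here is a short Python sketch. The function `lerp` is a made-up example, not taken from any particular project; what matters is that the docstring states what each argument means, what comes back, and shows one usage example.

```python
def lerp(a, b, t):
    """Linearly interpolate between two values.

    Args:
        a: Value returned when t == 0.
        b: Value returned when t == 1.
        t: Interpolation parameter; values outside [0, 1] extrapolate.

    Returns:
        The blended value (1 - t) * a + t * b.

    Example:
        >>> lerp(0.0, 10.0, 0.25)
        2.5
    """
    return (1.0 - t) * a + t * b
```

A side benefit of writing the example in this `>>>` form is that Python's standard `doctest` module can run it, so the documentation is checked against the code instead of silently drifting out of date.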
Readers are always drawn to appealing visuals. If possible, include exciting figures to show users what they will get from your code. These images will draw in casual passers-by, and advertise your work. Include at least one exciting image in your project’s top-level README or introduction, and use diagrams and figures within documentation wherever possible. Reuse your paper figures if you can!
7: Cross-platform code & bit rot
It’s very easy to write code that works great on your computer, but can’t be used on anyone else’s—and if you think it’s bad now, imagine trying to run your code in 20 years!
Most importantly, be sure to try compiling/running your software on machines other than your own before you release it. More generally, try hard to minimize the number of other software packages that your code depends on: simple dependency-free terminal applications are much more likely to survive the test of time than complicated applications built on frameworks which may be abandoned in 20 years.
Carefully document the versions of any dependencies which you do need, as well as steps used to compile or run your software. Include beginner-friendly instructions to help users unfamiliar with your toolchain run your code now and in the future.
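If your project happens to be in Python, one lightweight way to capture this information is a helper that reports the interpreter, operating system, and installed package versions, which you can paste into your README. This is only a sketch: `describe_environment` is a hypothetical helper name, and the package names you pass in should be your own project's dependencies.

```python
import platform
from importlib import metadata


def describe_environment(packages):
    """Return a dict of interpreter, OS, and installed package versions."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly instead of crashing.
            report[name] = "not installed"
    return report
```

For example, `describe_environment(["numpy", "scipy"])` would return the versions found on your machine, or `"not installed"` for anything missing. Recording the exact versions you tested with gives future users a known-good configuration to fall back on when newer releases break something.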
Wrapping up
Hopefully this tutorial gives some guidance on releasing code for your first SIGGRAPH paper. Remember, all of this is a lot to get right, and you don’t have to do it all perfectly the first time! Your best effort will still be valuable and greatly appreciated. Check out the other posts in this series for more tips on writing your first SIGGRAPH paper.
We thank Lingxiao Li for proofreading.