Dataverse GitHub integration

Modified on Mon, 12 Feb, 2024 at 1:44 PM

12-02-2024

Laura Huis in ‘t Veld


Dataverse version 6.0


From the User Guide:
Dataverse integration with GitHub is implemented via a Dataverse Uploader GitHub Action. It is a reusable, composite workflow for uploading a git repository or subdirectory into a dataset on a target Dataverse installation. The action is customizable, allowing users to choose to replace a dataset, add to the dataset, publish it or leave it as a draft version on Dataverse. The action provides some metadata to the dataset, such as the origin GitHub repository, and it preserves the directory tree structure.

For instructions on using Dataverse Uploader GitHub Action, visit https://github.com/marketplace/actions/dataverse-uploader-action

 

Dataverse Uploader GitHub Action was developed by Ana Trisovic.
License: https://github.com/IQSS/dataverse-uploader/blob/master/LICENSE


You will need:


  • The URL of the Dataverse instance, for example ‘https://dataverse.nl’

  • The DOI of the dataset where you would like to upload your files

  • Your Dataverse API token. (In Dataverse, click on your username and choose ‘API token’ from the dropdown menu to view your API token or to create one.)


Steps


  1. Go to the GitHub repository from which you would like to deposit files in Dataverse.

  2. Create a new .yml file in .github/workflows and paste in one of the code examples from https://github.com/marketplace/actions/dataverse-uploader-action

For example:

on:
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Send repo to Dataverse
        # Replace <version> with the release tag shown on the Marketplace page.
        uses: IQSS/dataverse-uploader@<version>
        with:
          DATAVERSE_TOKEN: ${{secrets.DATAVERSE_TOKEN}}
          DATAVERSE_SERVER: https://demo.dataverse.org
          DATAVERSE_DATASET_DOI: doi:10.70122/FK2/LVUA
          DELETE: False

  3. Change the value of DATAVERSE_SERVER to match the URL of your Dataverse instance.

  4. Change the value of DATAVERSE_DATASET_DOI to match your dataset DOI.

  5. Store your Dataverse API token as a secret in your repository.

    1. Go to ‘Settings’ and click on ‘Secrets and variables > Actions’ in the left panel.

    2. Click ‘New repository secret’.

    3. Name your secret ‘DATAVERSE_TOKEN’.

    4. Paste your Dataverse API token in the value field and add the secret.


  6. If you would like to upload only part of your files to Dataverse, you can indicate the subdirectory from which files should be deposited. To do this, add the following under ‘with:’
    GITHUB_DIR: <name of subdirectory>

  7. If you want your dataset to be published directly after the upload of your files, add
    PUBLISH: True

  8. By default, the action will sync the GitHub repository and the Dataverse dataset, meaning that it will delete the Dataverse content before uploading the content from GitHub. If you don't want the action to delete your dataset content before upload (for example, if you already have a Dataverse DRAFT dataset), add
    DELETE: False

  9. Open the workflow you have created (via ‘View runs’ or the ‘Actions’ tab of your repository) and choose ‘Run workflow’. Your files will now be uploaded to the dataset.
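
Put together, the optional settings from the steps above could look like the following workflow file. This is a minimal sketch, not the action's official example: the subdirectory name ‘data’ is hypothetical, ‘<version>’ is a placeholder for the release tag listed on the Marketplace page, and the server and DOI are the demo values used earlier in this article. It also assumes that setting PUBLISH to False leaves the dataset as a draft.

```yaml
# Sketch only: a manually triggered upload that uses all three optional inputs.
on:
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Send repo to Dataverse
        # <version> is a placeholder; use the release tag from the Marketplace page.
        uses: IQSS/dataverse-uploader@<version>
        with:
          DATAVERSE_TOKEN: ${{secrets.DATAVERSE_TOKEN}}   # the repository secret you created
          DATAVERSE_SERVER: https://demo.dataverse.org    # replace with your own instance
          DATAVERSE_DATASET_DOI: doi:10.70122/FK2/LVUA    # replace with your own dataset DOI
          GITHUB_DIR: data    # hypothetical subdirectory; only its files are uploaded
          DELETE: False       # do not delete existing dataset content before upload
          PUBLISH: False      # assumed to leave the dataset as a draft
```

Leave out any of the last three lines to fall back to the action's default for that setting.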



If you want the workflow to be triggered automatically, you can choose to run the workflow on:

  • release

  • push
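
As a rough illustration, the ‘on:’ block at the top of the workflow file could be changed as follows to use these triggers. The branch name ‘main’ and the choice to react only to published releases are assumptions for this sketch; adjust them to your repository.

```yaml
# Sketch: trigger the upload automatically as well as manually.
on:
  release:
    types: [published]   # run when a GitHub release is published
  push:
    branches: [main]     # run on every push to "main" (assumed branch name)
  workflow_dispatch:     # keep the manual 'Run workflow' option
```

Keep only the triggers you need. Note that with the push trigger every commit to the branch starts an upload, and unless DELETE is set to False the dataset content is replaced each time.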



Display the DOI of your dataset in GitHub


If you would like to display the DOI of your dataset in GitHub, you can use this badge generator:

https://atrisovic.github.io/dataverse-badge/

For example, you can paste the HTML snippet with the DOI badge into the README file of your repository.
