How to download large files from DataverseNL?

Modified on Thu, 22 Feb at 3:18 PM

Datasets sometimes contain large files. If you download multiple files at once, or a whole dataset, the system creates a zip file of the files you selected. However, to prevent capacity problems, zip downloads are limited to 10 GB. If the zip file reaches this limit, it will be incomplete and will include a file called 'MANIFEST.txt'. This text file states that some files were omitted because of the download limit.
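To check a downloaded zip file for this, you can list its contents and look for MANIFEST.txt. The following is a minimal sketch, assuming the Info-ZIP unzip tool is installed; the function name zip_is_incomplete is hypothetical, and the pattern also matches MANIFEST.txt inside a subfolder because its exact position in the archive is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit status 0) when the zip file contains
# a MANIFEST.txt entry, which indicates that files were omitted.
zip_is_incomplete() {
    local zip_file
    zip_file="$1"
    # unzip -Z1 lists the member names of the archive, one per line.
    # The pattern matches MANIFEST.txt at the top level or in any folder.
    unzip -Z1 "$zip_file" | grep -Eq '(^|/)MANIFEST\.txt$'
}
```

For example, `zip_is_incomplete dataverse_files.zip && echo "Some files were omitted"`, where dataverse_files.zip stands for whatever name your downloaded zip file has.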


If this happens, you can download the files one at a time via the user interface; the system does not create a zip file in that case. If the number of files makes this too tedious, you can try the following method.


You can view the dataset's files and folders as a directory index, and use wget (or a similar crawling client) to download all the files in the dataset.


For example:

https://dataverse.nl/api/datasets/:persistentId/dirindex?persistentId=doi:10.34894/0W2ROJ


The same request with curl, using shell variables for the server URL and the DOI:

curl "${SERVER_URL}/api/datasets/:persistentId/dirindex?persistentId=doi:${PERSISTENT_ID}"
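The curl command relies on two shell variables that must be set first. A minimal sketch, assuming the DataverseNL server and the example DOI shown earlier; the variable DIRINDEX_URL and the helper show_dirindex are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# Assumed values: the DataverseNL server and the example DOI from this article.
SERVER_URL="https://dataverse.nl"
PERSISTENT_ID="10.34894/0W2ROJ"   # the DOI without the leading "doi:"

# Build the directory index URL for the dataset.
DIRINDEX_URL="${SERVER_URL}/api/datasets/:persistentId/dirindex?persistentId=doi:${PERSISTENT_ID}"

# Hypothetical helper: fetch the index so it can be inspected before downloading.
# -f makes curl return an error on HTTP error statuses;
# -sS hides the progress meter but still prints error messages.
show_dirindex() {
    curl -fsS "$DIRINDEX_URL"
}
```

Calling `show_dirindex` prints the index to the terminal, which is a quick way to confirm that the DOI is correct before starting a large download.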


An example wget command for recursively downloading the files and folders in a dataset (the URL is quoted so that the shell does not interpret the '?' character):

wget -r -e robots=off -nH --cut-dirs=3 --content-disposition "https://dataverse.nl/api/datasets/:persistentId/dirindex?persistentId=doi:${PERSISTENT_ID}"
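If you download datasets regularly, the wget command can be wrapped in a small function. This is only a sketch: the function name download_dataset and the DRY_RUN switch are hypothetical, the server is assumed to be https://dataverse.nl, and the -P option (wget's directory prefix) is added so that the files are saved in a folder of your choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: recursively download all files of a dataset.
#   $1  DOI without the leading "doi:", for example 10.34894/0W2ROJ
#   $2  target folder (optional, defaults to the current folder)
download_dataset() {
    local doi
    local target_dir
    local url
    doi="$1"
    target_dir="${2:-.}"
    url="https://dataverse.nl/api/datasets/:persistentId/dirindex?persistentId=doi:${doi}"
    # When DRY_RUN is set to a non-empty value, the command is printed
    # instead of executed, so it can be checked before downloading.
    ${DRY_RUN:+echo} wget -r -e robots=off -nH --cut-dirs=3 \
        --content-disposition -P "$target_dir" "$url"
}
```

For example, `DRY_RUN=1 download_dataset 10.34894/0W2ROJ ./data` prints the wget command without running it; leaving out DRY_RUN=1 starts the download into ./data.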


More detailed information can be found in the Native API section of the Dataverse API Guide:

https://guides.dataverse.org/en/6.0/api/native-api.html?#view-dataset-files-and-folders-as-a-directory-index 
