pyDataverse 0.3.1 documentation
Release v0.3.1.
pyDataverse is a Python module for Dataverse that you can use for:
- accessing the Dataverse APIs
- manipulating and using the Dataverse (meta)data: Dataverses, Datasets, Datafiles
Whether you want to import large amounts of data into Dataverse, test your Dataverse instance after deployment, or just make basic API calls: pyDataverse helps you with Dataverse!
pyDataverse is fully Open Source and can be used by everybody.
pyDataverse is not supported right now. A new maintainer or funding is desired. Please contact the author, Stefan Kasberger, if you want to contribute in some way.
Install
To install pyDataverse, simply run this command in your terminal of choice:

    pip install -U pyDataverse
Find more options at Installation.
Requirements
pyDataverse officially supports Python 3.6–3.8
Python packages required:
- requests>=2.12.0
- jsonschema>=3.2.0
External packages required:
- curl (only necessary for replace_datafile())
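The requirements above can be checked before installing. The following is a minimal sketch, not part of pyDataverse: the helper names `version_tuple`, `meets_minimum` and `check_requirements` are made up for illustration, and `importlib.metadata` is assumed to be available (it is in the standard library from Python 3.8 onwards).

```python
import shutil
import sys
from importlib import metadata  # standard library since Python 3.8

# Minimum versions copied from the requirements list above.
MIN_VERSIONS = {"requests": "2.12.0", "jsonschema": "3.2.0"}


def version_tuple(version):
    """Turn '2.12.0' into (2, 12, 0); parsing stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted version strings numerically, padding with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements():
    """Return a list of human-readable problems (empty if nothing was found)."""
    problems = []
    if not (3, 6) <= sys.version_info[:2] <= (3, 8):
        problems.append("Python version is outside the officially supported 3.6-3.8 range")
    for name, minimum in MIN_VERSIONS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if not meets_minimum(installed, minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    if shutil.which("curl") is None:
        problems.append("curl not found (only needed for replace_datafile())")
    return problems


for problem in check_requirements():
    print(problem)
```

Versions are compared as tuples of integers because a plain string comparison would rank "2.9.1" above "2.12.0".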
Quickstart
Warning
Do not execute the example code on a Dataverse production instance unless you are 100% sure!
Import Dataset metadata JSON
To import the metadata of a Dataset from Dataverse’s own JSON format, use ds.from_json(). The created Dataset can then be retrieved with get().
For this example, we use the dataset.json from tests/data/user-guide/ (GitHub repo) and place it in the root directory.
    >>> from pyDataverse.models import Dataset
    >>> from pyDataverse.utils import read_file
    >>> ds = Dataset()
    >>> ds_filename = "dataset.json"
    >>> ds.from_json(read_file(ds_filename))
    >>> ds.get()
    {'citation_displayName': 'Citation Metadata', 'title': 'Youth in Austria 2005', 'author': [{'authorName': 'LastAuthor1, FirstAuthor1', 'authorAffiliation': 'AuthorAffiliation1'}], 'datasetContact': [{'datasetContactEmail': 'ContactEmail1@mailinator.com', 'datasetContactName': 'LastContact1, FirstContact1'}], 'dsDescription': [{'dsDescriptionValue': 'DescriptionText'}], 'subject': ['Medicine, Health and Life Sciences']}
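The dictionary returned by get() is a plain Python dict, so it can be inspected with ordinary dict and list operations. The sketch below works on a literal copy of the output shown above rather than on a live Dataset object, so it runs without pyDataverse installed; the helper `summarize` is hypothetical and not part of the library.

```python
# Literal copy of the ds.get() output shown above.
metadata = {
    "citation_displayName": "Citation Metadata",
    "title": "Youth in Austria 2005",
    "author": [
        {
            "authorName": "LastAuthor1, FirstAuthor1",
            "authorAffiliation": "AuthorAffiliation1",
        }
    ],
    "datasetContact": [
        {
            "datasetContactEmail": "ContactEmail1@mailinator.com",
            "datasetContactName": "LastContact1, FirstContact1",
        }
    ],
    "dsDescription": [{"dsDescriptionValue": "DescriptionText"}],
    "subject": ["Medicine, Health and Life Sciences"],
}


def summarize(md):
    """Build a one-line summary: title, author names and subjects."""
    authors = [entry["authorName"] for entry in md.get("author", [])]
    subjects = md.get("subject", [])
    return f"{md['title']} / {'; '.join(authors)} / {', '.join(subjects)}"


# Compound fields such as 'author' and 'datasetContact' are lists of dicts.
contact_emails = [c["datasetContactEmail"] for c in metadata["datasetContact"]]

print(summarize(metadata))
print(contact_emails)
```

With a real Dataset, the same helper could be called as `summarize(ds.get())`.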
Create Dataset by API
To access Dataverse’s Native API, you first have to instantiate NativeApi. Then create the Dataset through the API with create_dataset().
This returns, as all API functions do, a requests.Response object, with the DOI inside data.
Replace the following variables with your own instance data before you execute the lines:
- BASE_URL: Base URL of your Dataverse instance, without trailing slash (e.g. https://data.aussda.at)
- API_TOKEN: API token of a Dataverse user with proper rights to create a Dataset
- DV_PARENT_ALIAS: Alias of the Dataverse the Dataset should be attached to
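One way to supply these three values without hard-coding them is to read them from environment variables. This is only a sketch: the variable names `DATAVERSE_BASE_URL`, `DATAVERSE_API_TOKEN` and `DATAVERSE_PARENT_ALIAS` and the helper `load_settings` are assumptions made for illustration, and pyDataverse itself does not read them.

```python
import os

# Hypothetical environment variable names chosen for this sketch;
# pyDataverse itself does not read them.
ENV_NAMES = {
    "BASE_URL": "DATAVERSE_BASE_URL",
    "API_TOKEN": "DATAVERSE_API_TOKEN",
    "DV_PARENT_ALIAS": "DATAVERSE_PARENT_ALIAS",
}


def load_settings(env=None):
    """Return (BASE_URL, API_TOKEN, DV_PARENT_ALIAS) from a mapping.

    Uses os.environ when no mapping is given and raises ValueError if
    any value is missing or empty.
    """
    env = os.environ if env is None else env
    missing = [name for name in ENV_NAMES.values() if not env.get(name)]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    # The base URL must not end with a slash, so strip any that slipped in.
    base_url = env[ENV_NAMES["BASE_URL"]].rstrip("/")
    return base_url, env[ENV_NAMES["API_TOKEN"]], env[ENV_NAMES["DV_PARENT_ALIAS"]]


# Example with placeholder values instead of the real environment.
example_env = {
    "DATAVERSE_BASE_URL": "https://data.aussda.at/",
    "DATAVERSE_API_TOKEN": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "DATAVERSE_PARENT_ALIAS": "my_dataverse",
}
BASE_URL, API_TOKEN, DV_PARENT_ALIAS = load_settings(example_env)
print(BASE_URL)
```

Calling `load_settings()` without arguments reads the real environment, which keeps the API token out of scripts and version control.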
    >>> from pyDataverse.api import NativeApi
    >>> api = NativeApi(BASE_URL, API_TOKEN)
    >>> resp = api.create_dataset(DV_PARENT_ALIAS, ds.json())
    Dataset with pid 'doi:10.5072/FK2/UTGITX' created.
    >>> resp.json()
    {'status': 'OK', 'data': {'id': 251, 'persistentId': 'doi:10.5072/FK2/UTGITX'}}
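The persistent identifier can be pulled out of the decoded response body with plain dict access. The following sketch uses a literal copy of the resp.json() output shown above, so it runs without a Dataverse instance. The helper `extract_pid` is hypothetical, and the assumption that a failed call carries a `status` other than `'OK'` together with a `message` field should be checked against your Dataverse version.

```python
def extract_pid(payload):
    """Return the persistent identifier from a create-dataset response body."""
    if payload.get("status") != "OK":
        # Assumption: error bodies carry a human-readable 'message' field.
        detail = payload.get("message", payload)
        raise RuntimeError(f"Dataverse reported an error: {detail}")
    return payload["data"]["persistentId"]


# Literal copy of the resp.json() output shown above.
payload = {
    "status": "OK",
    "data": {"id": 251, "persistentId": "doi:10.5072/FK2/UTGITX"},
}
pid = extract_pid(payload)
print(pid)
```

With a live response object, the call would be `extract_pid(resp.json())`.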
For more tutorials, check out User Guide - Basic Usage and User Guide - Advanced Usage.
Features
- Comprehensive API wrapper for all Dataverse APIs and most of their endpoints
- Data models for each of Dataverse’s data types: Dataverse, Dataset and Datafile
- Data conversion to and from Dataverse’s own JSON format for API uploads
- Easy mass imports and exports through CSV templates
- Utils with helper functions
- Documented examples and functionalities
- Custom exceptions
- Tested (Travis CI) and documented (Read the Docs)
- Open Source (MIT)
User Guide
Contributor Guide
Thanks
To everyone who has contributed to pyDataverse, whether with an idea, an issue, a pull request, by developing the tools it uses, by sharing it with others or by any other means: thank you for your support!
Open Source projects live on the cooperation of many, and pyDataverse is no exception, so saying thank you is the least that can be done.
Special thanks to Lars Kaczmirek, Veronika Heider, Christian Bischof, Iris Butzlaff and everyone else from AUSSDA, Slava Tykhonov and Marion Wittenberg from DANS and all the people who do an amazing job by developing Dataverse at IQSS, but especially to Phil Durbin for his support from the first minute.
pyDataverse is funded by AUSSDA - The Austrian Social Science Data Archive and through the EU Horizon2020 programme SSHOC - Social Sciences & Humanities Open Cloud (T5.2).
License
Copyright Stefan Kasberger and others, 2019-2021.
Distributed under the terms of the MIT license, pyDataverse is free and open source software.
Full License Text: LICENSE.txt