diggersdiaries.org
Reader
The reader lets you read the pages of the diaries in several ways:
Explore by pages
Visualize more than 80,000 pages from the diaries, then sort and read them.
Explore by authors & diaries
See the list of diaries and their authors, view the main topic of each diary, and read them.
Explore by time
See the diaries ordered by date and read them too.
Most common topics by group
Maximum topic model score per page:
Most common topics
Open data
This project uses data from the World War I Diaries collection, owned by the State Library of New South Wales (SLNSW), Australia.
All the code created to gather, transform, and represent this collection is available online in a GitHub account, under the latest available version of the licence that the SLNSW recommends in its Terms of Use: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Australia.
The data from the collection is accessible through the SLNSW API. The transformed data is also available in the format generated for this project.
Follow these instructions to get the data:
- diaries.json: list of diaries and their metadata (author, title, dates, kind, SLNSW URL, topic score, library ID)
- Diary detail: each page of the diary (page ID, topic score)
- Transcriptions: every page, in HTML format, is accessible at data/transcriptions/[diaryID]/[pageID].txt
- Diary topic images: the images generated for each diary, showing the first topic group of each page, are directly accessible.
- Top topic pages: the top scores are accessible too.
- topics.json and terms.json: JSON files with the lists of topics and terms.
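As an illustration of the transcription path pattern listed above, here is a minimal Python sketch that builds the relative path for one page and, optionally, joins it to a site root. Only the pattern data/transcriptions/[diaryID]/[pageID].txt comes from this page; the function names, the example IDs, and the base URL are hypothetical and may not match the project's real identifiers or hosting.

```python
from pathlib import PurePosixPath
from urllib.parse import urljoin

# Directory named in the data listing above.
TRANSCRIPTIONS_ROOT = PurePosixPath("data/transcriptions")


def transcription_path(diary_id, page_id):
    """Return the relative path of one page transcription.

    Follows the documented pattern data/transcriptions/[diaryID]/[pageID].txt.
    """
    return str(TRANSCRIPTIONS_ROOT / str(diary_id) / f"{page_id}.txt")


def transcription_url(base_url, diary_id, page_id):
    """Join the relative path to a site root.

    base_url is an assumption supplied by the caller; this page does not
    state where the data directory is hosted.
    """
    base = base_url if base_url.endswith("/") else base_url + "/"
    return urljoin(base, transcription_path(diary_id, page_id))


if __name__ == "__main__":
    # Hypothetical IDs and host, used only to show the shape of the result.
    print(transcription_path("123", "456"))
    print(transcription_url("http://example.org", "123", "456"))
```

Keeping path construction in one small function means that a change in the directory layout only has to be made in one place.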
Credits
This is part of a creative-arts PhD by Jaume Nualart Vilaplana at the Faculty of Arts and Design (University of Canberra).
- Research & development: PhD Candidate Jaume Nualart Vilaplana
- Mentoring and supervising: Dr Mitchell Whitelaw, Ass. Prof. Digital Design & Media Arts, Faculty of Arts and Design (University of Canberra)
- Natural Language Processing adviser: Dr Gabriela Ferraro, Text Analytics Researcher at DATA61, CSIRO (Australia). Adjunct Research Fellow at the College of Engineering and Computer Science, Australian National University (ANU).
This project is free software and uses several external libraries and resources:
- JavaScript: AngularJS, jQuery
- Data processing: Python, shell, R, ImageMagick
- Interface: Bootstrap 2, Glyphicons
- Analysis: MALLET
- Images: GIMP, Dia
- Charts: LibreOffice, Google spreadsheet
- All done under GNU/Linux
Contact
- email: Jaume at nualart.cat
- twitter: @jaumetet