Data Sources

Dataset Builder File Types

CSV vs. JSON Lines Files

The dataset builder creates two files: a CSV file containing only metadata, and a JSON Lines file containing metadata and the textual data. The textual data includes unigrams, bigrams, trigrams, and full text (where available). The metadata may include:

Column Name    Description
id             a unique item ID
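To make the difference between the two files concrete, here is a minimal Python sketch that reads each one using only the standard library. It is an illustration under assumptions, not the builder's own code: the sample contents are made up, and the `unigramCount` field name is an assumed name for the unigram data. Only the `id` column comes from the description above.

```python
import csv
import io
import json


def read_jsonl(stream):
    """Yield one dict per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def read_metadata_csv(stream):
    """Return the rows of a metadata CSV as a list of dicts keyed by column name."""
    return list(csv.DictReader(stream))


# In-memory stand-ins for the two files. The contents are invented for
# illustration, and "unigramCount" is an assumed field name.
sample_jsonl = io.StringIO(
    '{"id": "doc-1", "unigramCount": {"the": 2, "cat": 1}}\n'
    '{"id": "doc-2", "unigramCount": {"a": 1}}\n'
)
sample_csv = io.StringIO("id\ndoc-1\ndoc-2\n")

documents = list(read_jsonl(sample_jsonl))
metadata_rows = read_metadata_csv(sample_csv)

# Both files describe the same items, so the IDs should line up.
jsonl_ids = [doc["id"] for doc in documents]
csv_ids = [row["id"] for row in metadata_rows]
```

With real files, the `io.StringIO` objects would be replaced by `open(path, encoding="utf-8")` handles. Reading the JSON Lines file one line at a time keeps memory use low for large datasets.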

Pre-Built Datasets from the Builder

Archaeology
  American Journal of Archaeology (1897-2020): 02b8c5c7-64bd-efe3-01d8-88c9efe7d17c

Classics
  Classical Quarterly (1907-2014): 82014740-8ed9-3c34-5716-d0879b8317f6

English
  Negro American Literature Forum (1967-1976) + Black American Literature Forum (1976-1991) + African American Review (1992-2016): b4668c50-a970-c4d7-eb2c-bb6d04313542
  Shakespeare Quarterly (1950-2013): f6ae29d4-3a70-36ee-d601-20a8c0311273
  ELH (1934-2014): 4999901a-fa17-31da-cfe5-2abf3a429df7
  College English (1939-2016): a161f384-720b-b6bf-a0cc-4d7d3b857e1c
  PMLA (1889-2014): 1aea53b9-26d5-fe54-e35c-8259156ce6cd

History
  Past & Present (1952-2014): 5e117960-e384-b705-b143-5a667fe614f0
  English Historical

Can I download a dataset I created in your builder?

Download a dataset created in the builder: You can download your full JSON-L dataset from the corpus builder in the link shown below. (You may also have a link to your dataset in your email.)

Download a dataset from the Jupyter Notebook: If you have used the tdm_client to

Data Sources in our Builder

You can bring data from any source into our environment by uploading it from your local machine. You may also bring in data from an API using your own code. We maintain a dataset builder that creates custom datasets. We prioritize adding new sources based on community desire and

Working with Dataset Files

Description: This notebook describes how to: read and write files (.txt, .csv, .json); use the tdm_client to read in metadata; use the tdm_client to read in data. This notebook describes how to read and write text, CSV, and JSON files using Python. Additionally, it explains how the tdm_
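As a rough illustration of the file handling that notebook covers, the sketch below writes and then reads back a text file, a CSV file, and a JSON file using only Python's standard library. It is not taken from the notebook: the file names, the `title` column, and the sample values are hypothetical, and it does not use the tdm_client.

```python
import csv
import json
import tempfile
from pathlib import Path

# Scratch directory so the example never overwrites real files.
work_dir = Path(tempfile.mkdtemp())

# Plain text: write a string, then read it back as a list of lines.
txt_path = work_dir / "notes.txt"
txt_path.write_text("first line\nsecond line\n", encoding="utf-8")
text_lines = txt_path.read_text(encoding="utf-8").splitlines()

# CSV: write dict rows with a header, then read them back as dicts.
# The "title" column and its values are hypothetical.
csv_path = work_dir / "metadata.csv"
rows = [
    {"id": "doc-1", "title": "Example A"},
    {"id": "doc-2", "title": "Example B"},
]
with csv_path.open("w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "title"])
    writer.writeheader()
    writer.writerows(rows)
with csv_path.open(newline="", encoding="utf-8") as f:
    csv_rows = list(csv.DictReader(f))

# JSON: serialize a dict to disk, then parse it back.
json_path = work_dir / "record.json"
record = {"id": "doc-1", "tokens": ["example", "text"]}
json_path.write_text(json.dumps(record, indent=2), encoding="utf-8")
loaded_record = json.loads(json_path.read_text(encoding="utf-8"))
```

Passing `newline=""` when opening CSV files is what the `csv` module documentation recommends, since it lets the module handle line endings itself.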

Can I create or use my own dataset?

Absolutely. You can upload a dataset and run any code on it you like. Keep in mind, however, that the notebooks JSTOR has written may need to be modified if your dataset is not in the right format. Read more about our format. We also recognize that folks may want

