The demand for analytics skills is growing rapidly across all domains. Text and data analysis is one of those skills, yet it remains difficult to learn. Researchers and students are often enticed by black-box, point-and-click tools that produce a few quick visualizations and whet the appetite. The next step in learning text analytics, however, is a steep one: it requires students to learn statistics and programming.

Constellate is brought to you by the ITHAKA services, JSTOR and Portico. Constellate's primary goal is to make it easier for anyone to learn these text analytics skills by creating a learning platform that empowers faculty, librarians, and other instructors to educate a generation in text and data analysis. We provide users with the ability to build datasets for analysis from a variety of sources and provide a gathering space for the growing community of practitioners.

Our solution is centered on student and researcher success, providing text and data analysis capabilities and access to content from some of the world’s most respected databases in an open environment with a variety of teaching materials that can be used, modified, and shared.

Summary of the Platform


Constellate provides value to users in three core areas. They can teach and learn text analytics, build datasets from across multiple content sources, and visualize and analyze their datasets:

Learn & Teach

  • Template and Tutorial Code: Work with template Jupyter Notebooks to analyze your dataset and learn about text analytics (with additional environments forthcoming, such as RStudio).
  • Lessons and Documentation: Lessons and educational materials created by a community of experts, including those from the NEH-funded Text Analysis Pedagogy Institutes.
  • Collaborative Teaching Materials Creation: Users may create, edit, reuse, and collaborate on tutorials, code, documentation, and other educational resources for text analysis (our tutorial notebooks are all available on GitHub, in addition to being accessible for use in our Analytics Lab).

Build

  • Multiple Collections: Anchor collections from JSTOR and Portico, with additional content sources continually added (such as Library of Congress’ Chronicling America). Further details about the collections are available.
  • Data Download in JSON
    • All content - bibliographic metadata, unigrams, bigrams, trigrams
    • Open content - bibliographic metadata, unigrams, bigrams, trigrams, full-text
  • Dataset Dashboard: Easily view datasets you have built or accessed.
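
As a rough illustration of working with a downloaded dataset, the sketch below sums unigram counts across documents. It assumes, hypothetically, that each line of the download is one JSON record and that per-document unigram counts sit under a key named "unigramCount"; the key name, the sample records, and the helper top_unigrams are illustrative assumptions, not a documented schema.

```python
import json
from collections import Counter


def top_unigrams(jsonl_lines, n=5):
    """Sum per-document unigram counts and return the n most frequent terms.

    Assumes each line is a JSON object with a hypothetical "unigramCount"
    key mapping each term to its count in that document.
    """
    totals = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Records without the assumed key contribute nothing.
        totals.update(record.get("unigramCount", {}))
    return totals.most_common(n)


# Made-up sample records standing in for a downloaded dataset.
sample_lines = [
    '{"title": "Doc A", "unigramCount": {"whale": 3, "sea": 2}}',
    '{"title": "Doc B", "unigramCount": {"whale": 1, "ship": 5}}',
]

print(top_unigrams(sample_lines, n=2))
```

With a real download, the same function could be fed the lines of an open file handle instead of the in-memory sample list.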

Analyze

  • Analytics Lab: Integrated computational environment powered by BinderHub that will allow users to seamlessly analyze text content using provided template Jupyter Notebooks and tutorials
  • Visualize: Built-in visualizations for your datasets
  • Work with Rights-Restricted Full-Text: Access to substantial compute cycles to work with the full text of rights-restricted content (forthcoming in late 2021; until then, it is possible to request JSTOR content through a personal agreement)

Roll Out

We are rolling out the subscription service by offering a six-month beta evaluation to institutions that participate in JSTOR or Portico. It is important to us that the platform be as widely available as possible while also covering our costs. To that end, there will always be a free tier of service available to individuals that improves on JSTOR’s self-service Data for Research (DfR) functionality (see our documentation about the differences between this new platform and DfR).

Institutional participants in the free trial will be able to provide their users with additional computational power in the Analytics Lab and participate in training sessions:

Feature | Non-Trial Users | Trial Participants
Build
Build & visualize datasets up to a specified number of items | 25K | 50K
Download datasets up to a specified number of items | 25K | 50K
Analyze
View and download built-in visualizations for datasets
Access to computational environment resources sufficient for | Learning | Teaching & research
Computational environment with learn to text mine notebooks
Compute environment - CPUs | <Core Tier | 4 cores
Compute environment - maximum memory | 2 GB | 8 GB
Unlimited simultaneous users in computational environment
Learn
Adopt, adapt, and contribute tutorials and documentation
Run institutional users’ (instructors, students, etc.) repositories of code in our computational environment
Attend our Train-the-Trainer workshops
Attend our four-week Learn Text Analytics Course


This free beta evaluation period is all about learning. It will help institutions gauge the demand on their campuses for this tool and the effort needed to implement it. It will help us assess how much usage the platform may see so that we can more accurately estimate costs and determine appropriate fees.

If you are interested in signing up for the free trial, please indicate your interest by filling out this form and we will be in touch to set up a video chat.

We are also running two webinars to introduce Constellate and the beta evaluation period, which you are welcome to join.

Subscription Service

In the second half of 2021, we expect to offer institutions subscriptions to a paid tier of service sized to be used for teaching and learning. We do not yet have pricing for these subscriptions. We want to balance the need to cover our costs with keeping these subscriptions reasonably priced. The beta evaluation period associated with our 2021 launch will help us and our institutions evaluate both cost and value. By the end of 2021, we plan to offer an additional tier of service aimed at meeting the more substantial demands of advanced researchers who require computing power and access to the full text of rights-restricted content. If you are an advanced researcher interested in exploring with us what might meet your needs, please let us know at tdm@ithaka.org.